---
output: 
  pdf_document:
    citation_package: natbib
    keep_tex: true
    latex_engine: pdflatex
    template: header.tex
title: "Public Preferences over Changes to the Composition of Government Tax Revenue"
abstract: "How governments raise tax revenue is at the core of domestic political conflict.  Public opinion towards taxation is measured generally and qualitatively by many surveys, but previous research has not closely linked public preferences to the budget problem faced by governments of how best to raise or cut a marginal quantity of revenue.  We present results from a novel tax preference experiment in which UK respondents are given choices over different tax 'levers' that are expected to raise or cut equal revenue.  We find that while different tax levers vary substantially in their popularity, there is a 'hidden consensus' regarding different tax levers across income levels and partisanship of respondents."
#thanks: "The data, replication instructions, and the data’s codebook can be found at https://dx.doi.org/"
author:
- name: Lucy Barnes
  affiliation: University College London
- name: Julia de Romémont
  affiliation: University College London
- name: Benjamin E Lauderdale
  affiliation: University College London
keywords: ""
# wordcount: 
date: "`r format(Sys.time(), '%B %d, %Y')`"
geometry: margin=1in
fontfamily: mathpazo
fontsize: 11pt
# papersize: a4paper
spacing: double
bibliography: tax-spend-experiment.bib
biblio-style: apsr
---

```{r,include=FALSE}
library(knitr)
library(kableExtra)
library(rstan)
library(MASS)
library(corrplot)
library(foreign)
library(abind)
library(car)
library(latex2exp)
library(ggplot2)
library(ggrepel)

## LOAD LIBRARIES AND DECLARE FUNCTIONS

knitr::opts_chunk$set(cache=TRUE,
                      dpi = 300,
     collapse=TRUE, eval=TRUE, echo=FALSE, 
     warning=FALSE, message=FALSE, dev.args=list(bg='transparent'))
set.seed(42)

round2 <- function(x) format(round(x,2),nsmall=2)
fadecol <- function(col,alpha){
    components <- col2rgb(col)
    return(rgb(components[1,]/255,components[2,]/255,components[3,]/255,alpha=alpha))
}
appendplus <- function(x) if (x > 0) return (paste0("+",x)) else return (paste0(x))

# set size of posterior simulations
N_warmup <- 250
N_iterations <- 1000
N_chains <- 2

# set seeds for R/Stan to use for stable posteriors across runs

R_seed <- 24364237
stan_seed <- 534762454

rstan_options(auto_write = FALSE) # recompile models each time
options(mc.cores = N_chains)
extract_sampler_parameters <- function(posterior,chain) attr(posterior@sim$samples[[chain]],"sampler_params")

# rerun models

new_run_null <- new_run_symmetric <- new_run_asymmetric <- new_run_procon <- new_run_biglittle <- new_run_labcon <- new_run_leaveremain <- new_run_gender <- new_run_income <- new_run_degree <- new_run_party <- new_run_income2 <- new_run_multivariate <- FALSE

# misc

lab_col <- "#E4003B"
con_col <- "#0087DC"
ld_col <- "#FAA61A"

remain_col <- "#FEC321"
leave_col <- "#0853A5"

long_labels <- c("Alcohol & tobacco duties", 
                 "Capital gains tax rate", 
                 "Council tax",
                 "Corporation tax rate",
                 "Fuel duties",
                 "Inheritance tax rate",
                 "Inheritance tax threshold", 
                 "SI contributions: main employee rate",
                 "SI contributions: higher employee rate",
                 "SI contributions: main employer rate",
                 "SI contributions: main self-employed rate",
                 "SI contributions: employee allowance",
                 "SI contributions: higher employee rate threshold",
                 "SI contributions: employer allowance",
                 "SI contributions: self-employed allowance",
                 "Income tax: top rate",
                 "Income tax: main rate",
                 "Income tax: higher rate ",
                 "Income tax: higher rate threshold",
                 "Income tax: personal allowance",
                 "Property transaction tax rates",
                 "Property transaction tax threshold",
                 "VAT standard rate")

```

# Introduction

Collecting taxes is one of the most fundamental actions of government, and decisions about how to raise revenue have important consequences for distribution and growth. However, we know relatively little about how citizens would prefer government revenues to be raised: which taxes are popular (or less unpopular) and with whom. The burgeoning experimental literature on public tax policy preferences has largely neglected these questions of the tax mix, while scholarship on the tax mix has sometimes overlooked public opinion.

Inattention to public preferences over how tax revenue is raised is surprising in light of canonical political economy models highlighting the optimisation problem that balances political satisfaction and revenue goals [@hettich1984positive]. From a policy perspective, political science has produced little direct evidence regarding the "dissatisfaction prices" of different revenue sources, a critical question in a time of high public deficits and rising future spending pressures. 

We study preferences over revenue-equivalent tax changes in the UK. We propose marginal changes to actually-existing taxes to a nationally representative sample of voters. Our survey experiment presents a choice between randomly paired possible changes to two different taxes at a time, specifying the quantitative change needed for each tax to generate the same revenue change. We model respondents' choices following a Bradley-Terry framework [@bradley1952rank] to estimate the *relative* popularity of different revenue-equivalent changes to the tax structure.

This empirical exercise makes three important contributions. First, we provide a comprehensive description of preferences over the balance of all the major taxes in the UK system, providing rare empirical evidence on public opinion over the tax mix. The differences in popularity between relatively preferred and relatively disliked taxes suggest that there is space in the UK tax system for majority-popular reforms. Second, we are able to separate preferences over the composition of taxation from preferences over its level. This reveals a hidden consensus among voters over where revenue should be raised. While partisanship and material interest may generate disagreement over the appropriate _level_ of taxation, there is widespread agreement on its _composition_.<!--[^hiddenconsensus] Substantively, this agreement consists of relative support for progressive taxes [@limberg2019whats].

[^hiddenconsensus]: Another example of such hidden consensus was documented by @hainmueller2015hidden about attitudes towards immigrants in the US.  They show that while there may be disagreement about how many immigrants there should be, there is little disagreement about which kinds of immigrants are more or less desirable.-->

Finally, our approach contributes to the emerging experimental literature on preferences over taxation [@kneafsey2022role; @ballard-rosa2017structure], expanding its scope to consider the composition of revenue collection across a wide range of taxes.  Understanding public tax attitudes through this cross-tax lens is an important complement to these studies, which often focus on explaining the unpopularity of certain taxes -- especially those with redistributive benefits [@scheve2021equal; @ansell2023wealth] -- but do not allow for the even lower popularity of raising revenue through less progressive channels. 


# Tax Composition and Public Preferences

Our theoretical inspiration comes primarily from an old public choice approach which sets the political resistance generated by different taxes against the revenues generated from each tax base [@hettich1984positive]. In the original model, the marginal pain of a pound paid in tax is assumed equal across taxes, but increasing non-linearly in the rate. Additional political costs arise from (different) administrative burdens across tax bases. Balancing revenue gains with political costs implies a diversified tax base, due to the increasing marginal costs, with higher relative reliance on easily administered taxes. However, to our knowledge, there have been no empirical calibrations of these popularity costs.[^taxsideonly] 

[^taxsideonly]: If the political costs of taxation depend on the benefits it finances, isolating taxation is a consequential simplification. However, this mirrors the common simplification of considering expenditure alone. Assuming that the spending profile will not change with a tax change is empirically realistic and implicit in our approach. 

Citizens may also dislike some taxes more than others for reasons beyond financial and administrative burden, as highlighted in existing research. Particular attention has been given to visibility [@wilensky2002rich], fairness [@scheve2021equal], and progressivity [@prasad2005politics]. However, the generality of these categories, and the potential for slippage between tax design and voter perception, mean that they do not provide strong expectations about attitudes towards specific taxes. 

On visibility, we follow @martin2021what in the view that attributions of visibility are typically based on untested assumptions, and sometimes on circular reasoning, where opposition to a tax is cited as an indication of its visibility, and visibility given as the reason for opposition. Where more specific predictions are made, visibility arguments often derive from idiosyncratic features of the United States tax system, which has received the most scholarly attention [@campbell2018tax]. 

Equally, the perceived fairness of a tax seems intuitively likely to affect its popularity, but what fairness consists in is indeterminate. Some accounts point to "equal treatment" [@scheve2021equal], but countervailing evidence points to fairness as the "ability to pay" [@daunton2002just], inherently requiring _unequal_ treatment. Similarly, misperceptions of how taxes actually work can lead to slippage from what voters might think fair under full information [@kuziemko2015how]. This makes it difficult to hypothesize in advance which taxes should elicit greater support on fairness grounds. 

The one exception here, perhaps, is to expect progressive taxes to be relatively popular. A large body of work finds widespread support for the principle of progressivity [@barnes2015size; @limberg2019whats], and majority support for progressive changes from the status quo [@ballard-rosa2017structure]. <!-- However, preferences for progressivity within the income tax may not equate to preferences for more progressive taxes in general as revenue sources. ---> 

But studies of support for progressivity have focused more on variation between people than on comparisons with other taxes. Progressivity preferences have been shown to be highly structured by income [@beramendi2016who], but this has not been cleanly separated empirically from preferences over the level of taxation, since progressivity is typically presented as higher taxes on the rich, but not also as lower taxes on the poor.

Meanwhile, in the literature on the tax mix, considering public opinion over types of taxes directly is rare. The central explanations of variations across countries (and over time) are located in political institutions and the relative power they give to groups with different interests [@kemmerling2021domestic]. These preferences are inferred from the material positions of these groups. Those with lower incomes "should favor a more progressive tax system, whereas richer voters should reject tax progressivity" [@haffert2021size, p.99]. Since they consume a larger share of their incomes, the less well-off should be less supportive of taxes on consumption. Symmetrically, (progressive) taxes on income and capital fall more heavily on the better-off [@timmons2005fiscal]. These materialist building blocks underpin the taxes that different parties and organized interests <!-- (employers, unions)  --> endorse, but constituents' preferences are assumed rather than investigated. 

The prediction of variation in tax mix preferences across income and partisan groups motivates our empirical verification. 

# Empirical Approach

We examine preferences over tax composition at the margin of current UK tax policy, and consider variation in preferences by income and party vote, in a novel survey. Our design directly tracks the quantities we want to estimate. First, our interest in tax composition means we want to consider preferences over budget-equivalent propositions. Second, we want to make sure that the comparisons we analyse are quantitatively informed. Otherwise, people may overestimate the feasibility of raising revenues from certain taxes [@johnson2023turbocharged]. Third, we want to elicit preferences over a comprehensive set of tax levers, rather than (only) those most salient to researchers. Taken together, these three considerations point to asking respondents their opinions on revenue-equivalent increases (or decreases) to as many existing taxes as possible.

We are able to do this in the UK thanks to the annual publication (by HMRC, the central tax authority) of the revenue effects of indicative changes to major national taxes: Income Tax, Corporation Tax, Capital Gains Tax, Inheritance Tax and National Insurance contributions, as well as Stamp Duty Land Tax^[Taxes on property transactions.], duties on alcohol, tobacco and fuel, and VAT rates. Where possible, the revenue estimates incorporate estimates of taxpayers' behavioural responses [@HMRC2021direct]. The data cover major thresholds as well as rates. We used the figures from June 2021 to calculate the changes to 23 tax levers implied by the same (£1 billion) revenue change from the status quo.[^levertexts] This incremental approach is similar to how tax policy tends to be made, through small adjustments to existing revenue levers [@rose1987taxation].
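
As a rough illustration of how a revenue-equivalent change can be derived from an indicative change, the sketch below rescales a change linearly to a target revenue amount. The function name, the linear-scaling assumption and the input figures are hypothetical and chosen only for illustration; they are neither HMRC estimates nor the values used in the experiment.

```r
# Hypothetical helper (not from the replication code): rescale an indicative
# tax change so that it yields a target revenue change, ASSUMING that revenue
# responds linearly to the size of the change at the margin.
revenue_equivalent_change <- function(indicative_change, indicative_revenue_bn,
                                      target_revenue_bn = 1) {
  stopifnot(indicative_revenue_bn != 0)
  indicative_change * target_revenue_bn / indicative_revenue_bn
}

# Made-up inputs: a 1 percentage point rate rise assumed to raise 5bn implies
# a 0.2 point rise for a 1bn target under linear scaling.
example_change <- revenue_equivalent_change(indicative_change = 1,
                                            indicative_revenue_bn = 5)
```

Linear rescaling is only a reasonable approximation for small changes around the status quo, which is the margin the design targets.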

[^levertexts]: A list of these, descriptions of the status quo, and of the proposed changes (as used in the experiment) can be found in the [appendix](#treatments).

```{r,include=FALSE} 
library(haven)
library(lubridate)
#dates <- read_sav("../data/P_lse_tax_experiment_Sept21_client.sav")[,c(2,3)]
#dates$time <-  as.numeric(as.duration(interval(dates$starttime,dates$endtime)),"minutes")

# Merge data on treatments into survey data

survey <- suppressWarnings(read.spss("../data/P_lse_tax_experiment_Sept21_client_with_timing.sav",to.data.frame=TRUE))
treatments <- read.csv("../data/survey_prompts_final.csv")

survey_treatments_a <- treatments[as.numeric(survey$exp3a_csv_row_seen),]
colnames(survey_treatments_a) <- paste0("a_",colnames(survey_treatments_a))
survey_treatments_b <- treatments[as.numeric(survey$exp3bexp3b),]
colnames(survey_treatments_b) <- paste0("b_",colnames(survey_treatments_b))

survey <- cbind(survey,survey_treatments_a,survey_treatments_b)

survey$choice <- car::recode(as.character(survey$q1_exp3),'"Option A"=1;"Option B"=0;"I think both of these changes are equally good or bad"=0.5;"Don\'t know"=0.5',as.numeric = TRUE)
survey$choice_sign <- 2*(survey$a_revenueDirection == "increase") - 1

survey$a_lever <- factor(survey$a_lever)
survey$b_lever <- factor(survey$b_lever,levels=levels(survey$a_lever))

lever_labels <- levels <- levels(survey$a_lever)
names(levels) <- long_labels

treatments$lever_long <- forcats::fct_recode(treatments$lever, !!!levels)

survey$vote_19 <- car::recode(survey$pastvote_ge_2019,'"Conservative"="Con";"Labour"="Lab";"Liberal Democrat"="LD";"Scottish National Party (SNP)"="Other";"Plaid Cymru"="Other";"Brexit Party"="Other";"Green"="Other";"Other"="Other";else="None"')


```

We presented `r nrow(survey)` respondents with one pairwise choice between tax changes.[^forcedchoicevalid] Our survey was fielded by YouGov to a nationally representative sample of UK adults between the 4$^{th}$ and the 14$^{th}$ of October 2021. Each response is a choice between two reforms relative to the pre-existing baseline, and each proposal includes the headline change, an account of how the relevant tax works, and the size of the change required to raise or cut the required revenue. Figure \ref{uktaxsummary} shows an example choice, as delivered to respondents.
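
The random pairing can be sketched as follows. This is an illustrative stand-in rather than the fielded survey logic: the function and column names are hypothetical, the count of 23 levers follows the text, and we assume that both proposals in a choice share one revenue direction. Any other randomised features of the design are ignored.

```r
# Illustrative only: draw one random pairwise comparison per respondent.
draw_comparisons <- function(n_respondents, n_levers = 23, seed = 1) {
  set.seed(seed)
  # Two distinct levers per respondent, in random A/B order
  pairs <- t(replicate(n_respondents, sample.int(n_levers, 2)))
  data.frame(
    a_lever = pairs[, 1],
    b_lever = pairs[, 2],
    # One revenue direction per choice, shared by both proposals (assumption)
    direction = sample(c("increase", "cut"), n_respondents, replace = TRUE)
  )
}

comparisons <- draw_comparisons(1000)
```

Because `sample.int()` draws without replacement by default, a respondent is never asked to compare a lever with itself.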

[^forcedchoicevalid]: In comparisons of different types of survey-experimental approaches to behavioural benchmarks, paired choice designs like this one tend to perform the best [@hainmueller2015validating]. 

\begin{figure}[t]
\includegraphics[width=\textwidth]{screenshot_without_arguments_shiny.jpg}
\caption{Survey Experiment Prompt Example. The direction of the change (increase/decrease) and the two taxes proposed are randomised across the choices. The size of the change to the tax is determined by the change necessary to change the revenue yield by £1 billion. \label{uktaxsummary}}
\end{figure}

Our presentations are different to the way citizens typically encounter tax proposals. In public debate, there is usually no counterfactual budget-equivalent option to change another tax instead. Tax reform proposals also typically provide less practical explanation, and more overt normative framing. It is not our concern here to ascertain the effects of framing on tax popularity [it matters, @mccaffery2004framing]. Rather, we try to elicit any views the public may have on the underlying budget problem, where revenue equivalencies are critical. Budget-equivalent alternative proposals reflect an important feature of political reality, if one less commonly presented to the public.^[To the chagrin of economists [@dilnot2023impartiality].]

## Basic Response Statistics and Task Complexity

Of `r length(survey$choice)` responses to our experiment, `r sum(survey$choice == 1)` endorse proposal $A$ and `r sum(survey$choice == 0)` endorse proposal $B$. `r sum(survey$choice == 0.5)` are neutral responses, of which `r sum(survey$q1_exp3 == "I think both of these changes are equally good or bad")` express "I think both of these changes are equally good or bad" while `r sum(survey$q1_exp3 == "Don't know")` answer "Don't know".[^rtypepcts]  The latter may include respondents who failed to engage with the task, but citizens similarly fail to engage with tax policy choices in real politics.<!--[^responseratesbyvoted19]-->  We retain both types of neutral response, rather than dropping respondents, to maintain representativeness.  Higher rates of neutral responses for particular taxes simply make these less likely to be estimated as especially popular or unpopular. 

[^rtypepcts]: The overall shares choosing one of the two proposals, that the two are equal, and "don't know" are `r sum(100*round(prop.table(xtabs(W8~q1_exp3,data=survey)),2)[1:2])`%, `r 100*round(prop.table(xtabs(W8~q1_exp3,data=survey)),2)[3]`% and `r 100*round(prop.table(xtabs(W8~q1_exp3,data=survey)),2)[4]`%, respectively. 

<!--[^responseratesbyvoted19]: The rate of "Don't know" responses was `r 100*round(prop.table(xtabs(W8~I(voted_ge_2019 == "Yes, voted")+q1_exp3,data=survey),1)[1,4],2)`% among 2019 non-voters versus `r 100*round(prop.table(xtabs(W8~I(voted_ge_2019 == "Yes, voted")+q1_exp3,data=survey),1)[2,4],2)`% among 2019 voters.-->

The extent of the neutral responses is understandable given that the random pairwise comparisons yield many comparisons that even well-informed individuals might not have strong views about.[^morechecks]  We see some evidence of variation in neutral response rates by the complexity of the choice.[^responseratebylever] However, some real tax changes *would be* complex, and it is of substantive interest if that yields neutrality.  What we ask of respondents is still less complicated than many applications in the literature [for an example on the spending side, see @bonica2015measuring].  

[^responseratebylever]: Neutral and don't know responses are more common in comparisons that include National Insurance tax levers, and relatively rare in comparisons that include simpler (e.g. alcohol and tobacco tax) levers. Levers with a high share of don't know responses also have a higher share (on average) of "equally good or bad" responses. 

[^morechecks]: We provide further descriptive statistics on engagement in the [appendix](#attentionchecks).


## Models for Tax Preference Choices

We build a series of models to summarize the data. Using $Y_i$ to denote respondent $i$'s choice, we code responses as follows:

- $Y_i = 1$ if the respondent prefers $A$
- $Y_i = 0.5$ if the respondent gives a neutral response
- $Y_i = 0$ if the respondent prefers $B$.

This allows us to interpret differences on the scale of proportions of respondents preferring one tax option to another, while retaining the neutral responses.
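
A minimal sketch of this coding, using a handful of made-up responses with the answer labels from the survey question, shows why the mean of $Y_i$ can be read on the scale of proportions: it equals the share preferring $A$ plus half the neutral share. The object names are illustrative.

```r
# Made-up responses, for illustration only
responses <- c("Option A", "Option B", "Don't know", "Option A",
               "I think both of these changes are equally good or bad")

# Map each answer to the Y scale: 1 = prefers A, 0 = prefers B, 0.5 = neutral
y_codes <- c("Option A" = 1, "Option B" = 0,
             "I think both of these changes are equally good or bad" = 0.5,
             "Don't know" = 0.5)
y <- unname(y_codes[responses])

# The mean of Y equals the share preferring A plus half the neutral share
share_a <- mean(responses == "Option A")
share_neutral <- mean(y == 0.5)
mean_y <- mean(y)
```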

Following a generalized Bradley-Terry model framework, we model the expected value of $Y_i$ as a function of the competing "popularities" $\pi_{j}$ of different tax change proposals $j$.  Writing $\pi_{iA}$ and $\pi_{iB}$ for the popularities of the proposals shown to respondent $i$ as options $j \in \{A, B\}$, this can be written:  $$E\left[Y_i\right] = \alpha + \pi_{iA} - \pi_{iB}.$$ 
$\alpha$ is the expected value of $Y_i$ when the two proposals are equally popular, i.e. if $\pi_{iA} = \pi_{iB}$.^[$\alpha$ can be thought of as the advantage of a proposal being option $A$ vs option $B$, irrespective of content. We do not find any evidence that $\alpha$ deviates from $0.5$ (no advantage) in our data.]  Note that the popularities in this model are only identified relative to one another: pairwise comparison data only yields information about relative, not absolute, popularity of options.  Full identification and estimation details for our baseline and variant models are in the [appendix](#modelspec).
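
The relative identification can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the estimation used in the paper: the popularities are made up, the noise is Gaussian, the fit is ordinary least squares, and survey weights, the direction of the tax change and the hierarchical prior are all omitted.

```r
set.seed(1)
n_levers <- 5
n_obs <- 5000
alpha_true <- 0.5
pi_true <- c(-0.10, -0.05, 0, 0.05, 0.10)  # hypothetical popularities

# Random pairs of distinct levers shown as options A and B
pairs <- t(replicate(n_obs, sample.int(n_levers, 2)))

# Expected response follows E[Y] = alpha + pi_A - pi_B; add Gaussian noise
mu <- alpha_true + pi_true[pairs[, 1]] - pi_true[pairs[, 2]]
y <- mu + rnorm(n_obs, sd = 0.3)

# Design matrix: +1 for the lever shown as A, -1 for the lever shown as B
D <- matrix(0, n_obs, n_levers)
D[cbind(seq_len(n_obs), pairs[, 1])] <- 1
D[cbind(seq_len(n_obs), pairs[, 2])] <- -1

# Every row of D sums to zero, so only differences are identified:
# drop the last lever and treat it as the reference category
fit <- lm(y ~ D[, -n_levers])
alpha_hat <- unname(coef(fit)[1])
pi_hat <- c(unname(coef(fit)[-1]), 0)      # popularity relative to lever 5
pi_true_rel <- pi_true - pi_true[n_levers]
```

Adding any constant to every element of `pi_true` leaves `mu` unchanged, which is why the sketch compares `pi_hat` with `pi_true_rel` rather than with `pi_true` itself.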


```{r}


stan_code_null <- "

data {
    int<lower = 1> N_responses; 
    real<lower = 0,upper=1> Y[N_responses];
    real W[N_responses]; // Survey weights
}

parameters {
    real alpha;
    real<lower = 0> sigma;
}

transformed parameters {
    real mu[N_responses];
    for (i in 1:N_responses) mu[i] = alpha;
}

model {
    for (i in 1:N_responses) target += W[i] * normal_lpdf(Y[i] | mu[i], sigma);
}

generated quantities {

}

"



stan_code_X <- "
 
data {
    int<lower = 1> N_responses; 
    int<lower = 1> N_levers; 
    int<lower = 1> N_covars; 
    real<lower = 0,upper=1> Y[N_responses];
    int<lower = 1, upper = N_levers> L[N_responses,2]; 
    int<lower = -1, upper = 1> S[N_responses];  
    matrix[N_responses,N_covars] X; 
    real W[N_responses]; // Survey weights
}

parameters {
    real alpha;
    matrix[N_levers,N_covars] beta_std;
    real<lower = 0> sigma; // response variation
    real<lower = 0> sigma_beta[N_covars]; // hierarchical variation in average tax lever preferences
}

transformed parameters {
    real mu[N_responses];
    matrix[N_levers,N_covars] beta;
    
    for (j in 1:N_levers) for (l in 1:N_covars) beta[j,l] = beta_std[j,l] * sigma_beta[l];
    
    for (i in 1:N_responses) mu[i] = alpha + S[i]*(beta[L[i,1]] * X[i]') - S[i]*(beta[L[i,2]] * X[i]');
}


model {
    for (i in 1:N_responses) target += W[i] * normal_lpdf(Y[i] | mu[i], sigma);
    for (j in 1:N_levers) for (l in 1:N_covars) beta_std[j,l] ~ normal(0,1);
}

generated quantities {

}

"


```



```{r} 

# Null Model

stan_data_null <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    N_responses = nrow(survey)
)


if (new_run_null) {

    set.seed(R_seed)
    
    posterior_null <- stan(
        model_code = stan_code_null, 
        data = stan_data_null,
        pars=c(
            "alpha","sigma"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_null,stan_data_null,file="posterior_null.Rdata")
    
}  

# Baseline Model

stan_data_symmetric <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey))),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_symmetric$N_covars <- ncol(stan_data_symmetric$X)

if (new_run_symmetric) {

    set.seed(R_seed)
    posterior_symmetric <- stan(
        model_code = stan_code_X, 
        data = stan_data_symmetric,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_symmetric,stan_data_symmetric,file="posterior_symmetric.Rdata")
    
}

# Income Model

stan_data_income <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$profile_gross_household,c("£45,000 to £49,999 per year","£50,000 to £59,999 per year","£60,000 to £69,999 per year","£70,000 to £99,999 per year","£100,000 to £149,999 per year","£150,000 and over")),
              is.element(survey$profile_gross_household,c("Don't know","Prefer not to answer"))
              ),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_income$N_covars <- ncol(stan_data_income$X)


if (new_run_income) {

    set.seed(R_seed)
    posterior_income <- stan(
        model_code = stan_code_X, 
        data = stan_data_income,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_income,stan_data_income,file="posterior_income.Rdata")
    
} 

# LabCon Model

isLabCon19 <- is.element(survey$pastvote_ge_2019,c("Conservative","Labour"))

stan_data_labcon <- list(
    W = as.numeric(survey$W8[isLabCon19]),
    Y = as.numeric(survey$choice[isLabCon19]),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever))[isLabCon19,], # tax levers for A and B
    S = survey$choice_sign[isLabCon19], # sign of tax change
    X = cbind(rep(1,sum(isLabCon19)),
      (survey$pastvote_ge_2019[isLabCon19] == "Conservative")
      ),
    N_responses = nrow(survey[isLabCon19,]),
    N_levers = nlevels(survey$a_lever)
)
stan_data_labcon$N_covars <- ncol(stan_data_labcon$X)


if (new_run_labcon) {

    set.seed(R_seed)
    posterior_labcon <- stan(
        model_code = stan_code_X, 
        data = stan_data_labcon,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_labcon,stan_data_labcon,file="posterior_labcon.Rdata")
    
} 

# Asymmetric Model

stan_data_asymmetric <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              survey$choice_sign),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_asymmetric$N_covars <- ncol(stan_data_asymmetric$X)


if (new_run_asymmetric) {

    set.seed(R_seed)
    posterior_asymmetric <- stan(
        model_code = stan_code_X, 
        data = stan_data_asymmetric,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_asymmetric,stan_data_asymmetric,file="posterior_asymmetric.Rdata")
    
} 

# Big vs Little Model

stan_data_biglittle <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(as.numeric((survey$a_revenueDirection == "cut")),
      as.numeric((survey$a_revenueSize == "£10 billion") & (survey$a_revenueDirection == "cut")),
       as.numeric((survey$a_revenueDirection == "increase")),
      as.numeric((survey$a_revenueSize == "£10 billion") & (survey$a_revenueDirection == "increase"))
      ),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_biglittle$N_covars <- ncol(stan_data_biglittle$X)


if (new_run_biglittle) {

    set.seed(R_seed)
    posterior_biglittle <- stan(
        model_code = stan_code_X, 
        data = stan_data_biglittle,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_biglittle,stan_data_biglittle,file="posterior_biglittle.Rdata")
    
} 

# Argument Model

stan_data_procon <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(as.numeric((survey$a_revenueDirection == "cut")),
              as.numeric((survey$qsplit_arguments== "the pro arguments for both A and B") & (survey$a_revenueDirection == "cut")),
              as.numeric((survey$qsplit_arguments== "the con arguments for both A and B") & (survey$a_revenueDirection == "cut")),
              as.numeric((survey$a_revenueDirection == "increase")),
              as.numeric((survey$qsplit_arguments== "the pro arguments for both A and B") & (survey$a_revenueDirection == "increase")),
              as.numeric((survey$qsplit_arguments== "the con arguments for both A and B") & (survey$a_revenueDirection == "increase"))
    ),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_procon$N_covars <- ncol(stan_data_procon$X)


if (new_run_procon) {

    set.seed(R_seed)
    posterior_procon <- stan(
        model_code = stan_code_X, 
        data = stan_data_procon,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_procon,stan_data_procon,file="posterior_procon.Rdata")
    
} 

# Leave vs Remain Model

isLR16 <- is.element(survey$pastvote_EURef,c("I voted to Remain","I voted to Leave"))

stan_data_leaveremain <- list(
    W = as.numeric(survey$W8[isLR16]),
    Y = as.numeric(survey$choice[isLR16]),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever))[isLR16,], # tax levers for A and B
    S = survey$choice_sign[isLR16], # sign of tax change
    X = cbind(rep(1,sum(isLR16)),
      (survey$pastvote_EURef[isLR16] == "I voted to Leave")
      ),
    N_responses = nrow(survey[isLR16,]),
    N_levers = nlevels(survey$a_lever)
)
stan_data_leaveremain$N_covars <- ncol(stan_data_leaveremain$X)


if (new_run_leaveremain) {

    set.seed(R_seed)
    posterior_leaveremain <- stan(
        model_code = stan_code_X, 
        data = stan_data_leaveremain,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_leaveremain,stan_data_leaveremain,file="posterior_leaveremain.Rdata")
    
} 

# Turnout Model

stan_data_turnout <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$voted_ge_2019,c( "Yes, voted"))),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_turnout$N_covars <- ncol(stan_data_turnout$X)


if (new_run_degree) {

    set.seed(R_seed)
    posterior_turnout <- stan(
        model_code = stan_code_X, 
        data = stan_data_turnout,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_turnout,stan_data_turnout,file="posterior_turnout.Rdata")
    
}

# 2019 vote model

stan_data_vote19 <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$vote_19,c("Lab")),
              is.element(survey$vote_19,c("LD")),
              is.element(survey$vote_19,c("Other")),
              is.element(survey$vote_19,c("None"))
              ),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_vote19$N_covars <- ncol(stan_data_vote19$X)


if (new_run_party) {

    set.seed(R_seed)
    posterior_vote19 <- stan(
        model_code = stan_code_X, 
        data = stan_data_vote19,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_vote19,stan_data_vote19,file="posterior_vote19.Rdata")
    
}

# Income Model (v2)

stan_data_income2 <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$profile_gross_household,c("under £5,000 per year","£5,000 to £9,999 per year","£10,000 to £14,999 per year","£15,000 to £19,999 per year","£20,000 to £24,999 per year")),
              is.element(survey$profile_gross_household,c("£60,000 to £69,999 per year","£70,000 to £99,999 per year","£100,000 to £149,999 per year","£150,000 and over")),
              is.element(survey$profile_gross_household,c("Don't know","Prefer not to answer"))
              ),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_income2$N_covars <- ncol(stan_data_income2$X)


if (new_run_income2) {

    set.seed(R_seed)
    posterior_income2 <- stan(
        model_code = stan_code_X, 
        data = stan_data_income2,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_income2,stan_data_income2,file="posterior_income2.Rdata")
    
}

# Gender Model

stan_data_gender <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$profile_gender,c("Female"))),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_gender$N_covars <- ncol(stan_data_gender$X)


if (new_run_gender) {

    set.seed(R_seed)
    posterior_gender <- stan(
        model_code = stan_code_X, 
        data = stan_data_gender,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_gender,stan_data_gender,file="posterior_gender.Rdata")
    
}

# University Degree Model

stan_data_degree <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$profile_education_level,c( "University or CNAA first degree (e.g. BA, B.Sc, B.Ed)","University or CNAA higher degree (e.g. M.Sc, Ph.D)","Other technical, professional or higher qualification"))),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_degree$N_covars <- ncol(stan_data_degree$X)


if (new_run_degree) {

    set.seed(R_seed)
    posterior_degree <- stan(
        model_code = stan_code_X, 
        data = stan_data_degree,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_degree,stan_data_degree,file="posterior_degree.Rdata")
    
}

# Multivariate Model

stan_data_multivariate <- list(
    W = as.numeric(survey$W8),
    Y = as.numeric(survey$choice),
    L = cbind(as.numeric(survey$a_lever),as.numeric(survey$b_lever)), # tax levers for A and B
    S = survey$choice_sign, # sign of tax change
    X = cbind(rep(1,nrow(survey)),
              is.element(survey$profile_gross_household,c("£45,000 to £49,999 per year","£50,000 to £59,999 per year","£60,000 to £69,999 per year","£70,000 to £99,999 per year","£100,000 to £149,999 per year","£150,000 and over")),
              is.element(survey$profile_gross_household,c("Don't know","Prefer not to answer")),
              is.element(survey$profile_education_level,c( "University or CNAA first degree (e.g. BA, B.Sc, B.Ed)","University or CNAA higher degree (e.g. M.Sc, Ph.D)","Other technical, professional or higher qualification")),
              is.element(survey$profile_gender,c("Female")),
              is.element(survey$pastvote_EURef,c("I voted to Leave")),
              is.element(survey$vote_19,c("Lab")),
              is.element(survey$vote_19,c("LD")),
              is.element(survey$vote_19,c("Other")),
              is.element(survey$vote_19,c("None"))
              ),
    N_responses = nrow(survey),
    N_levers = nlevels(survey$a_lever)
)
stan_data_multivariate$N_covars <- ncol(stan_data_multivariate$X)


if (new_run_multivariate) {

    set.seed(R_seed)
    posterior_multivariate <- stan(
        model_code = stan_code_X, 
        data = stan_data_multivariate,
        pars=c(
            "alpha","beta","sigma","sigma_beta"
        ), 
        seed = stan_seed,
        
        iter = N_warmup + N_iterations,
        warmup = N_warmup, 
        chains = N_chains,
        refresh = 1, save_dso = TRUE,
        control = list(max_treedepth = 10, adapt_delta = 0.95)
    )  
    
    save(posterior_multivariate,stan_data_multivariate,file="posterior_multivariate.Rdata")
    
}
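
# Sketch only: the covariate models above build the same list and differ only in X.
# A hypothetical helper such as make_stan_data() could assemble that list in one place.
# The function name and its arguments are assumptions and are not used elsewhere in
# this analysis. It assumes `data` has the columns used above:
# W8, choice, a_lever, b_lever and choice_sign.
make_stan_data <- function(data, X, keep = rep(TRUE, nrow(data))) {
    X <- as.matrix(X)
    stopifnot(nrow(X) == nrow(data), length(keep) == nrow(data), !anyNA(keep))
    X <- X[keep, , drop = FALSE]
    storage.mode(X) <- "double" # logical indicators become 0/1
    list(
        W = as.numeric(data$W8[keep]),
        Y = as.numeric(data$choice[keep]),
        L = cbind(as.numeric(data$a_lever), as.numeric(data$b_lever))[keep, , drop = FALSE], # tax levers for A and B
        S = data$choice_sign[keep], # sign of tax change
        X = X,
        N_responses = sum(keep),
        N_levers = nlevels(data$a_lever),
        N_covars = ncol(X)
    )
}
# Possible usage, mirroring the gender model:
# stan_data_gender <- make_stan_data(survey, cbind(1, is.element(survey$profile_gender, "Female")))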

```


# Results: Preferences Over Tax Levers

```{r}

load(file="posterior_null.Rdata")

alpha_est <- mean(extract(posterior_null)$alpha)
alpha_int <- quantile(extract(posterior_null)$alpha,c(0.025,0.975))
    
```


```{r}

load(file="posterior_symmetric.Rdata")

alpha_est <- mean(extract(posterior_symmetric)$alpha)
alpha_int <- quantile(extract(posterior_symmetric)$alpha,c(0.025,0.975))

sigma_beta_est <- mean(extract(posterior_symmetric)$sigma_beta)
sigma_beta_int <- quantile(extract(posterior_symmetric)$sigma_beta,c(0.025,0.975))

beta_est <- apply(extract(posterior_symmetric)$beta,2,mean)
beta_int <- apply(extract(posterior_symmetric)$beta,2,quantile,c(0.025,0.975))
    
```

```{r preferencesymmetric,fig.pos="h",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers, in units of probability of supporting taxation via a given lever versus others. \\label{preferencesymmetric}"}


plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_symmetric$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est)
points(beta_est[sortorder],1:stan_data_symmetric$N_levers,pch=16)
for (k in 1:stan_data_symmetric$N_levers) {
  lines(beta_int[,sortorder[k]],c(k,k))
  if (beta_est[sortorder[k]] > 0) text(beta_int[1,sortorder[k]],k,long_labels[sortorder[k]],pos=2,cex=0.7) else  text(beta_int[2,sortorder[k]],k,long_labels[sortorder[k]],pos=4,cex=0.7) 
}
```

Figure \ref{preferencesymmetric} shows estimates of the relative preferences for each tax lever (averaging over all comparisons in the experiment).[^reversecode] First, the differences are substantial. Increasing (or not decreasing) the corporation tax rate is preferred to increasing (or not decreasing) Council Tax by `r round(diff(range(beta_est)),2)`. With a representative level of neutral responses, this corresponds to a population-level response distribution in which `r round(25+100*diff(range(beta_est))/2,1)`% of respondents prefer the corporation tax rate increase, and only `r round(25-100*diff(range(beta_est))/2,1)`% prefer the Council Tax increase. The remaining 50% are indifferent or don't know. From the perspective of political efficiency, the differences across taxes imply that popular reforms to the composition of tax revenues are available.

[^reversecode]: We "reverse code" the tax decrease prompts in this analysis, such that higher estimates correspond to taxes $j$ that are preferred as a source of revenue. See [appendix](#modelspec) for mathematical details.   
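
The conversion from a difference in estimates to response shares can be sketched as follows, taking the 50% neutral share stated above as given. Write $d$ for the difference between the estimates for the two levers, $P_A$ for the share preferring to raise revenue through the more popular lever, and $P_B$ for the share preferring the less popular one (these symbols are introduced here for illustration only). The non-neutral half of respondents would split evenly at $d = 0$, and the difference moves $d/2$ from one option to the other:

$$
P_A = 0.25 + \frac{d}{2}, \qquad P_B = 0.25 - \frac{d}{2}, \qquad P_A - P_B = d .
$$

For a purely hypothetical $d = 0.30$ (not an estimate from our data), this would give 40% against 10%, with the remaining 50% neutral.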

```{r}

# Compare raw data for the most extreme pairwise comparison to model estimates

in_extreme_pair <- rowMeans(t(apply(stan_data_symmetric$L,1,is.element,c(which.min(beta_est),which.max(beta_est))))) == 1

Y_in_extreme_pair <- (2*stan_data_symmetric$Y[in_extreme_pair]-1) *
  sign(apply(stan_data_symmetric$L[in_extreme_pair,],1,diff)) * # AB vs BA
  stan_data_symmetric$S[in_extreme_pair] # increase vs decrease

```

Second, the taxes that are most popular are generally progressive: those on higher earners and on capital or corporate incomes. This is consistent with previous research asking about general preferences, but here the pattern replicates for concrete policy levers. Moreover, while support for these taxes may be economically naive, our design limits naivety as far as possible. The revenue estimates we provided attempt to incorporate behavioural responses to tax changes, and the scale of the required changes to rates reflects the narrow bases of these taxes.^[As another indicator of the lack of explanatory power of naivety for these results, we see no less support for these progressive taxes among the more highly educated.]

## The Hidden Consensus on Taxation

We also examine differences in the popularity of tax levers between types of respondent, characterized by income and partisanship. We find very little variation by income, and only slightly more by party, in the taxes that British citizens prefer. This consensus may be hidden in simpler designs, whose estimates of the popularity of particular taxes are contaminated by divergent views on the overall level of taxation. 

```{r}

load(file="posterior_income.Rdata")


alpha_est <- mean(extract(posterior_income)$alpha)
alpha_int <- quantile(extract(posterior_income)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_income)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_income)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_income)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_income)$beta,c(2,3),quantile,c(0.025,0.975))


under45k_chains <- extract(posterior_income)$beta[,,1]
over45k_chains <- extract(posterior_income)$beta[,,1] + extract(posterior_income)$beta[,,2]
refused_chains <- extract(posterior_income)$beta[,,1] + extract(posterior_income)$beta[,,3]

under45k_est <- apply(under45k_chains,2,mean)
over45k_est <- apply(over45k_chains,2,mean)
refused_est <- apply(refused_chains,2,mean)

under45k_int <- apply(under45k_chains,2,quantile,c(0.025,0.975))
over45k_int <- apply(over45k_chains,2,quantile,c(0.025,0.975))
refused_int <- apply(refused_chains,2,quantile,c(0.025,0.975))

under_over_45k_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0
under45k_refused_diff_sig <- rowSums(sign(t(beta_int[,,3]))) != 0

beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 over45k=round(beta_est[,2],2),
 refused=round(beta_est[,3],2)
)
    
```


```{r}

#kable(beta_est_table_pretty,booktabs =T,linesep="")

```


```{r preferencebyIncome,fig.pos="b",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers for respondents with household incomes above 45k (blue squares), below 45k (red circles) and those who did not answer the income item (grey triangles), in units of probability of supporting taxation via a given lever versus others. Solid points indicate tax levers where the 95% interval for the difference between those below 45k and the respective other group excludes zero. \\label{preferencebyIncome}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_income$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + 0.5*beta_est[,2] + 0.5*beta_est[,3])
points(beta_est[sortorder,1],1:stan_data_income$N_levers,pch=1 + 15*under_over_45k_diff_sig[sortorder] + 15*under45k_refused_diff_sig[sortorder] - 15*under45k_refused_diff_sig[sortorder]*under_over_45k_diff_sig[sortorder],col="red")
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_income$N_levers+0.2,pch=15*under_over_45k_diff_sig[sortorder],col="blue")
points(beta_est[sortorder,1]+beta_est[sortorder,3],1:stan_data_income$N_levers-0.2,pch=2+15*under45k_refused_diff_sig[sortorder],col="grey")
for (k in 1:stan_data_income$N_levers) {
  lines(under45k_int[,sortorder[k]],c(k,k),col="red")
  lines(over45k_int[,sortorder[k]],c(k,k)+0.2,col="blue")
  lines(refused_int[,sortorder[k]],c(k,k)-0.2,col="grey")
  # Positive estimates are labelled to the left of the lowest lower bound,
  # all others to the right of the highest upper bound
  if (under45k_est[sortorder[k]] + over45k_est[sortorder[k]] > 0) {
    text(min(under45k_int[1,sortorder[k]],over45k_int[1,sortorder[k]],refused_int[1,sortorder[k]]),k,long_labels[sortorder[k]],pos=2,cex=0.75,
         col=rgb(0,0,0,0.25+0.75*under_over_45k_diff_sig[sortorder[k]]+0.75*under45k_refused_diff_sig[sortorder[k]]-0.75*under45k_refused_diff_sig[sortorder[k]]*under_over_45k_diff_sig[sortorder[k]]))
  } else {
    text(max(under45k_int[2,sortorder[k]],over45k_int[2,sortorder[k]],refused_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,
         col=rgb(0,0,0,0.25+0.55*under_over_45k_diff_sig[sortorder[k]]+0.75*under45k_refused_diff_sig[sortorder[k]]-0.75*under45k_refused_diff_sig[sortorder[k]]*under_over_45k_diff_sig[sortorder[k]]))
  }
}

legend("topleft",legend = c("above 45k","below 45k","not answered"),
       col = c("blue","red","gray"),
       pch = c(15,16,17),
       title = "Household income",
       cex = 0.75, box.lty=0)

```

Figure \ref{preferencebyIncome} shows estimates for respondents with household incomes above and below £45,000,[^incomethreshold] and those who did not give an income response.  Figure \ref{preferencebyvote19} shows estimates for Conservative and Labour voters. In both figures the overall orderings of the taxes are similar across groups, and there are few levers (indicated with solid points on the figures) where there are statistically significant differences in the popularities of individual taxes between groups.

[^incomethreshold]: Of the income response thresholds in the survey data, this was the one closest to median household income in the UK at the time of the survey.  We present an analysis split by approximate income tercile (at £25,000 and £60,000) in the appendix, and the results are very similar.

Only the corporation tax rate and council tax have statistically differentiable levels of popularity by income. Those with incomes over £45,000 see both of these taxes more favourably than those with incomes below £45,000. For corporation tax, this reinforces support for a very popular tax, while the council tax is less unpopular with high-income respondents. There are no significant differences by income for the two higher rates of personal income taxation (the higher and the top rates), nor for the threshold at which the higher rate kicks in.  Higher-income respondents also endorse raising revenue through other progressive taxes (capital gains tax rates, stamp duty, and inheritance taxation) just as strongly as lower-income respondents. Overall, the correlation between the preference estimates for those with incomes under versus over £45,000 is `r round2(cor(beta_est[,1],beta_est[,1] + beta_est[,2]))`.  
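
The group-level quantities reported here follow directly from the posterior draws. As a minimal sketch, using simulated draws rather than our fitted model, the code below forms the estimates for a baseline group and a comparison group, flags levers whose 95% interval for the group difference excludes zero, and computes the correlation between the two sets of estimates. The array `draws`, its dimensions, and all simulated values are assumptions made for illustration only.

```r
set.seed(1)
n_draws <- 2000
n_levers <- 5

# Simulated stand-in for extract(posterior)$beta: draws x levers x covariates.
# Covariate 1 is the baseline group; covariate 2 is the offset for the other group.
draws <- array(NA_real_, dim = c(n_draws, n_levers, 2))
baseline_mean <- c(-0.20, -0.10, 0.00, 0.10, 0.20) # illustrative values
offset_mean <- c(0.00, 0.00, 0.00, 0.00, 0.08)     # illustrative values
for (j in seq_len(n_levers)) {
  draws[, j, 1] <- rnorm(n_draws, baseline_mean[j], 0.02)
  draws[, j, 2] <- rnorm(n_draws, offset_mean[j], 0.02)
}

# Posterior means for the baseline group and for baseline + offset
base_est <- apply(draws[, , 1], 2, mean)
other_est <- apply(draws[, , 1] + draws[, , 2], 2, mean)

# 95% interval for each lever's offset; a difference is flagged when both
# bounds share a sign, so that the interval excludes zero
offset_int <- apply(draws[, , 2], 2, quantile, c(0.025, 0.975))
diff_sig <- colSums(sign(offset_int)) != 0

# Agreement between the two groups' estimates across levers
group_cor <- cor(base_est, other_est)
```

In this sketch `diff_sig` plays the role of the solid-point indicator in the figures, and `group_cor` corresponds to the correlation reported in the text.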


```{r}

load(file="posterior_labcon.Rdata")

alpha_est <- mean(extract(posterior_labcon)$alpha)
alpha_int <- quantile(extract(posterior_labcon)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_labcon)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_labcon)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_labcon)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_labcon)$beta,c(2,3),quantile,c(0.025,0.975))

lab_chains <- extract(posterior_labcon)$beta[,,1]
con_chains <- extract(posterior_labcon)$beta[,,1] + extract(posterior_labcon)$beta[,,2]

con_est <- apply(con_chains,2,mean)
lab_est <- apply(lab_chains,2,mean)
con_int <- apply(con_chains,2,quantile,c(0.025,0.975))
lab_int <- apply(lab_chains,2,quantile,c(0.025,0.975))
con_lab_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0

beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 con=round(beta_est[,2],2)
)
    
```

There are more taxes where partisan differences can be found, but again, the headline picture is of consensus. Labour voters are more supportive than Conservative voters of higher rates of personal income tax on the highest earners, and of raising revenue through inheritance and fuel taxation. Conservative voters are more supportive of three of the eight possible changes to social insurance contributions.

These social insurance differences deserve some comment. The Conservative UK government had just announced changes to this tax when the experiment was fielded.^[See \url{https://theconversation.com/autumn-budget-2021-experts-react-170741}.] These comprised slight cuts to revenue via adjustments to tax-free allowances.  Meanwhile, substantial increases in _rates_ for employees and the self-employed increased revenue. In our data, one of these three rates (the main rate for employees) and two of the thresholds are more popular among Conservatives. While Conservative voters do not quite endorse the precise enacted changes, it seems plausible that the partisan patterns could reflect short-term effects rather than durable preference cleavages.


```{r preferencebyvote19,fig.pos="t",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers for 2019 Conservative (blue squares) versus 2019 Labour (red circles) voters, in units of probability of supporting taxation via a given lever versus others. Solid points indicate tax levers where the 95% interval for the party difference excludes zero. \\label{preferencebyvote19}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_symmetric$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + 0.5*beta_est[,2])
points(beta_est[sortorder,1],1:stan_data_symmetric$N_levers-0.1,pch=1 + 15*con_lab_diff_sig[sortorder],col=lab_col)
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_symmetric$N_levers+0.1,pch=15*con_lab_diff_sig[sortorder],col=con_col)
for (k in 1:stan_data_symmetric$N_levers) {
  lines(lab_int[,sortorder[k]],c(k,k)-0.1,col=lab_col)
  lines(con_int[,sortorder[k]],c(k,k)+0.1,col=con_col)
 if (lab_est[sortorder[k]] + con_est[sortorder[k]] > 0) text(min(lab_int[1,sortorder[k]],con_int[1,sortorder[k]]),k,long_labels[sortorder[k]],pos=2,cex=0.75,col=rgb(0,0,0,0.25+0.75*con_lab_diff_sig[sortorder[k]])) else  text(max(lab_int[2,sortorder[k]],con_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,col=rgb(0,0,0,0.25+0.55*con_lab_diff_sig[sortorder[k]])) 
}
  
legend("topleft",legend = c("Conservative","Labour"),
       col = c(con_col,lab_col),
       pch = c(15,16),
       title = "General Election vote",
       cex = 0.75, box.lty=0, title.adj=0)
```


<!--
```{r,fig.pos="t",fig.width=6,fig.height=6,fig.cap="Relative popularity of a given tax among Conservative 2019 voters as a function of the relative popularity of a given tax among Labour 2019 voters"}

plot(beta_est[,1],beta_est[,1] + beta_est[,2],
     xlab="Relative Popularity of Tax - Lab 2019 Voters",ylab="Relative Popularity of Tax - Con 2019 Voters",
     xlim=c(-0.25,0.25),ylim=c(-0.25,0.25),pch=16)
abline(h=0,col=rgb(0,0,0,0.5))
abline(v=0,col=rgb(0,0,0,0.5))
abline(a=0,b=1,col=rgb(0,0,0,0.5))
text(beta_est[,1],beta_est[,1] + beta_est[,2],long_labels,pos=2 + 2*(beta_est[,2] < 0),
    cex=0.4)

#for (k in 1:stan_data_symmetric$N_levers) {
#  lines(rep(nu_est[1,k],2),nu_int[,2,k],col=rgb(0,0,0,0.5))
#  lines(nu_int[,1,k],rep(nu_est[2,k],2),col=rgb(0,0,0,0.5))
#}

```
-->

Even with this immediate pre-experiment shock to attitudes, partisan differences are not very large when considered across all levers. The correlation between the preference estimates for Labour vs Conservative voters is `r round2(cor(beta_est[,1],beta_est[,1] + beta_est[,2]))`. This consensus is surprising in light of the comparative literature on the tax mix which grounds partisan differences in the divergent interests of different parties' constituents.[^furtherconsensus]

[^furtherconsensus]: We explored further variation by EU Referendum vote, by 2019 turnout, by 2019 vote including all parties, by gender and by education in the [appendix](#othercovariates). None of the sets of estimates showed any particularly systematic differences in preferences either, providing further evidence for a 'hidden consensus'.

An alternative interpretation of these patterns in the data is not consensus, but incomprehension, or a lack of engagement. That is, sceptics may argue that our survey respondents are not really giving us meaningful responses to the choices we give them. We disagree with dismissing the consensus we see here on this basis for three reasons. First, we do see substantively large differences, in the aggregate, in the relative popularities of the taxes. Second, there are consistent differences between taxes within partisan groups: that is, it is hard to explain away the fact that (for example) the popularity advantage of the corporation tax over taxes on property transactions is the same among Conservative and Labour supporters. Finally, equally complex surveys on other topics -- such as the allocation of government spending -- do reveal strong partisan divisions [@barnes2022measuring]. Overall, then, that there are differences in the popularity of different taxes, but that the patterns of variation differ little across types of respondent, points more to consensus than to a lack of substantive engagement with the task.

## Robustness

Our results are robust to a number of other experimental variations (reported in the [appendix](#robustnessfigs)). First, we model choices over increases separately from choices over decreases, to check our assumption that a single underlying popularity for each lever drives both kinds of choice. Second, we consider much larger changes -- £10 billion, instead of £1 billion -- for the 'big five' taxes with which it is plausible to raise that much revenue. Finally, we consider choices made when we provide additional arguments for or against both options, as a check on the sensitivity of our results to differences in presentation. For all three of these variations, there is little evidence of any substantial difference from our main results. 

## Generalizability

How idiosyncratic is our finding that popular, revenue-neutral tax reforms are available, that is, that the status quo departs from the politically efficient tax mix? There may be some theoretical reasons to expect low responsiveness of policy to public opinion in Britain [@hobolt2008government], which would make the gap we find between preferences and the status quo tax system unusual. But more recent data show little variation across countries, with the UK even among the more responsive [@rasmussen2018opinion]. Taking taxation more specifically, politicians setting tax policy in Britain have relatively high levels of insulation [@steinmo1993taxation], but this cuts two ways: it limits direct public influence, but politicians (compared to tax experts or civil servants) are the policy actors most likely to be sensitive to public preferences.

On the popularity ranking of taxes, we cannot draw conclusions about whether the source of (relative) popularity lies in specific features of Britain's implementation of particular taxes, or in broader characteristics shared by these taxes across countries. However, with the possible exception of property taxes (Council Tax and Stamp Duty Land Tax), most UK taxes are not particularly unusual in comparative perspective. Moreover, while our experiment makes this limitation very obvious, it is not unique to our design. In broader cross-national studies, or with more general question wordings, we also do not know if respondents are reacting to their experience of country-specific particularities. 

The obvious extension, to fill these gaps, is to field appropriately domesticated equivalent surveys in other countries, yielding cross-national evidence on preferences over concrete policies. Researchers could then consider which underlying theoretical characteristics (progressivity, visibility) are associated with support for different tax mixes as a useful complement to asking respondents their views on these characteristics directly.

A more consequential limitation of the generality of our methodology is that the design is difficult to extend beyond actually-existing taxes. This precludes the examination of, for example, a well-designed wealth tax, or a flat tax on income.  However, there are offsetting gains in terms of the practicability of the proposed reforms (and thus the policy utility of our results), as well as the relative familiarity and credibility of the proposals to respondents. 

# Conclusion

<!--This paper does not directly answer questions as to why mismatches between public preferences over taxes and the tax system in the UK exist.  Moreover, it remains unknown whether similar mismatches exist in different countries where policy-makers face different institutional constraints, or may hold different beliefs about the incidence of tax reforms. Nonetheless, our paper makes an important contribution by being one of the first of such explorations of the 'hissing frontier' of taxation and by making some headway in "the search for a popular tax".  An important step towards better understanding the political tensions that are being balanced in tax system design is to document the tax levers on which there is currently most force being applied by the public. -->

We use experimental control to identify preferences over specific tax parameters in isolation from  accompanying revenue changes which otherwise make the measurement of preferences about tax composition very difficult.  We rely on respondents' ability to make comparisons between concrete proposals -- such that they need not articulate a full preference ordering, nor the details of what they like or dislike about specific taxes -- which is a more feasible task in a highly technical area. The revenue-equivalent changes bring the policy choice much closer to politicians' (or Treasury civil servants') tax policy problem. 

We thereby identify the levers that might be involved in politically viable tax reform in the UK, minimising public dissatisfaction with taxation for a given revenue level, and show that the existing composition of UK taxation is far from optimising the revenue-discontent tradeoff. Specifically, increasing taxes on corporations, higher-income taxpayers, capital gains, and alcohol and tobacco is likely to be less politically painful than other increases. To the extent that tax cuts can be found, they will be most popular if broadly distributed, and targeted at the lower end of the income tax schedule. Equally, two of the UK taxes widely regarded as dysfunctional by policy experts and economists, Council Tax and National Insurance, are also disliked by the general population. Communicated with appropriate reference to the real revenue trade-offs, their reform should be politically feasible. Given the partisan (and socio-demographic) consensus over the tax mix, these aggregate patterns do not mask major electoral cleavages blocking this kind of reform. 

<!--
We cannot go so far as to say that tax reform is a vote-winning platform, but our results show that not all taxes are equal in terms of their popularity with the public. Thus tax \emph{cuts} can be actively vote-losing. Liz Truss and Kwasi Kwarteng learned an extreme version of this lesson. Their proposed large-scale cuts to taxation in September 2022 were immediately unpopular. With headline cuts to corporation tax, the additional rate of income tax, and reversing a planned rise in alcohol duty, our experiment predicted this unpopularity.  And, even before the financial market response was clear, when these individual tax proposals were polled immediately after their announcement [@TimesYouGovPollSeptember2022], just 11% thought abolishing the additional rate was a "good idea".--> <!--versus 71% who though it was the "wrong priority" and the corresponding figures were 31% and 48% on cancelling a planned increase in corporation tax.  The additional rate elimination was only thought to be a good idea by 18% of Conservative 2019 voters, despite being just proposed by a Conservative government.  While other components of the package were individually popular,--> <!-- Only 19% thought that the overall package was "fair", versus 57% who did not, and these tax proposals were scrapped as Truss unsuccessfully attempted to save her brief premiership from the ensuing political and economic storm.

In theoretical terms, the consensus over the preferred composition of taxation indicates a lack of purchase for theories grounded in material interest or in partisanship to explain preferences over the tax mix. More generally, explanations that emphasize individual-level differences in responses to taxation face important limitations, as our mapping of public opinion describes a broad consensus across political and demographic groups. This is most plausibly underpinned by shared views on what fair taxation looks like. As in the long-run history of UK taxation, an ability to pay norm seems most consistent with the characteristics of popular versus unpopular taxes [@daunton2002just]. 

Our design was intended primarily to provide an accurate descriptive mapping of preferences over tax revenue composition, sensitive to the actual revenue-raising power of each tax lever, rather than explaining these preferences. As such, more deliberate explanatory empirical investigations -- including verifying whether the fairness norms that we point to here really influence tax mix preferences -- are an obvious next step for research. But our contribution here is nevertheless significant: we should not try to explain what we cannot accurately describe. We also demonstrate a method for presenting respondents with choices on taxation which isolates the question of the allocation of the tax burden across taxes from the more general -- and more contested -- question of the appropriate level of taxation overall, and also directly incorporates the revenue tradeoffs involved in setting tax policy. While this kind of approach is becoming more common in the study of spending allocations, its absence from the tax side until now has limited our understanding of public opinion on taxation. 

The fact that we observe substantial variation in the popularity of different tax levers at the margin of current tax policy illustrates that the current tax rate equilibrium in the UK is not solely determined by the forces of public opinion.  In these deviations, we can see possible countervailing forces that might cause the political equilibrium for the tax mix to diverge from the mix that would maximise public satisfaction.  First, the substantive tax-mix preferences of those with the power to set policy may differ from public preferences, whether because they have more awareness of relevant economic constraints or because they are personally unrepresentative.  For example, high rates of corporate taxes can shift some corporate activities out of a country, while low rates can shift them into it. These considerations may not be fully internalised by the general public in the short run.  Second, many of the levers which we identify as popular -- relative to the status quo -- can be characterised as having concentrated groups who would bear the costs of increases [@kemmerling2021domestic, pp.84-88]: corporations for the CT rate, alcohol and tobacco producers and pubs for A&T duties, and affluent people in general for the higher and additional rates of Personal Income Tax, Capital Gains Tax, and Inheritance Tax levers.  Our focus on the popularity of these levers at the margin of current tax policy means that we can note the locations where the current tax equilibrium is not optimal in terms of public opinion, and thus where there are countervailing political forces to be identified, but documenting these countervailing forces is a subject for a different kind of empirical study than this one. -->

Our approach in this paper uses the actually-existing tax system as its starting point, asking questions (only) about concrete potential modifications.  A far more challenging problem would be to attempt to characterize public attitudes away from the current margin. The concrete details required also make any implementation of measuring such tax mix preferences parochial: our measurement tool could be 'domesticated' to other tax systems, but we would only be able to learn whether VAT in Germany is more (or less) popular relative to the actually-existing German income tax system, and not whether Germans or Brits are more predisposed to favour sales taxes in the abstract. Nevertheless, replicating this comprehensive approach, covering a broad universe of tax levers, in different countries would vastly increase our understanding of attitudes towards taxation by taking preferences over tax composition seriously. 

<!-- Something something other countries, meaning of life. Fin. -->




\pagebreak 


```{=latex}
\begin{landscape}
```
# Appendix

## Table of Tax Levers {#treatments}

```{r}
# library(kableExtra)
# tmp <-  unique(data.frame("a"=paste0(treatments$lever," (",treatments$revenueDirection," ",treatments$revenueSize,")"),
#                           "b"= paste(treatments$taxDescription,treatments$complementSQ,treatments$leverSQ),
#                           "c"= paste0(treatments$statementOfChange, " would ",
#                                       treatments$revenueDirection," tax revenue by ",treatments$revenueSize," per year.")))
# colnames(tmp) <- c("Tax lever","Description of status quo","Statement of change")
# rownames(tmp) <- NULL
# 
# kable(tmp, booktabs =T,longtable = TRUE,linesep="") %>% 
#   kable_styling(latex_options = c("striped","HOLD_position","repeat_header"),font_size = 8) %>%
#   column_spec(1, width = "3cm") %>% column_spec(2, width = "8cm") %>% column_spec(3, width = "4cm")

```



```{r}
library(kableExtra)
library(dplyr)
tmp <-  unique(data.frame("a"=treatments$lever,"b"=as.character(treatments$lever_long),
                          "c"= paste(treatments$taxDescription,treatments$complementSQ,treatments$leverSQ),
                          "d"= paste0(treatments$statementOfChange, " would ",
                                      treatments$revenueDirection," tax revenue by ",treatments$revenueSize," per year."),
                          "v" = treatments$revenueDirection,
                          "r" = treatments$revenueSize)) %>%
  tidyr::pivot_wider(values_from = d, names_from = v) %>%
  dplyr::group_by(a,b,c) %>% dplyr::mutate(r = dplyr::n()) %>% dplyr::ungroup() 
ind <- tmp %>% mutate(n = row_number()) %>% select(a,r,n) %>% reframe(n = list(n),.by=a) %>% 
  arrange(n) %>% slice(seq(1,23,2)) %>% tidyr::unnest(cols = n)  # name the list-column explicitly
ind2 <- tmp %>% mutate(n = row_number()) %>% select(a,r,n) %>% reframe(n = max(n),.by=a) %>% 
  arrange(n) 
ind3 <- setdiff(1:28,ind2$n)
tmp <- tmp %>%
  dplyr::mutate(
    b = stringr::str_replace_all(b,"SI","Social insurance"),
    a = ifelse(r==2,paste0("\\multirow[t]{2}{2cm}{",a,"}"),a),
    b = ifelse(r==2,paste0("\\multirow[t]{2}{2cm}{",b,"}"),b),
    c = ifelse(r==2,paste0("\\multirow[t]{2}{7cm}{",c,"}"),c),
    across(everything(), ~stringr::str_replace_all(.,"&","\\\\&")),
    across(everything(), ~stringr::str_replace_all(.,"%","\\\\%")),
    across(c(a,b,c), ~ ifelse(duplicated(.)," ",.))) %>% select(-r)
colnames(tmp) <- c("Tax lever (short)","Tax lever (long)","Description of status quo","Statement of change to increase revenue","Statement of change to cut revenue")
rownames(tmp) <- NULL

kable(tmp, booktabs =T,longtable = TRUE,linesep="",format = "latex",escape = F) %>% 
  kable_styling(latex_options = c("striped","HOLD_position","repeat_header"),font_size = 6.5,stripe_index = ind$n) %>%
  column_spec(1, width = "2cm") %>% column_spec(2, width = "2cm") %>%
  column_spec(3, width = "7cm") %>% column_spec(4, width = "5cm")  %>% column_spec(5, width = "5cm")  %>%
  row_spec(ind3,extra_latex_after = "\\cline{4-5}") %>%
  row_spec(ind2$n[-23],extra_latex_after = "\\hline")

```

```{=latex}
\end{landscape}
```

\pagebreak

## Statistics on Respondent Attention {#attentionchecks}

### Response Time by Response Category   

```{r}
# Response time (in seconds) by answer
tmp <- aggregate(page_p_exp3_timing ~ q1_exp3,data=survey, function(x) round(median(x),2)) 
colnames(tmp) <- c("Answer","Median Response Time (in seconds)")
tmp$Answer <- as.character(tmp$Answer)

tmp$Answer[tmp$Answer == "I think both of these changes are equally good or bad"] <- "Neutral"

kable(tmp, booktabs =T, align = "lrrrr") %>%  
  kable_styling(latex_options = c("HOLD_position")) 

```


### Response Time and Share of Neutral Responses by Tax Lever

```{r}
tmp <- survey[,c("page_p_exp3_timing","q1_exp3","a_lever","b_lever")]
tmp <- tidyr::pivot_longer(tmp,cols = 3:4)

# Response time (in seconds) by tax lever
tmp1 <- aggregate(page_p_exp3_timing ~ value, data=tmp, function(x) round(median(x),2)) 
colnames(tmp1) <- c("Tax Lever","All responses")
rownames(tmp1) <-NULL

# Excluding don't know responses
tmpxdk <- survey[which(survey$q1_exp3!= "Don't know") ,c("page_p_exp3_timing","q1_exp3","a_lever","b_lever")]
tmpxdk <- tidyr::pivot_longer(tmpxdk,cols = 3:4)

# Response time (in seconds) by tax lever
tmp1x <- aggregate(page_p_exp3_timing ~ value, data=tmpxdk, function(x) round(median(x),2)) 
colnames(tmp1x) <- c("Tax Lever","Excluding DK's")
rownames(tmp1x) <-NULL


# Share of Neutrals and Don't Know's by tax lever
tmp2 <- tibble::rownames_to_column(
  as.data.frame.matrix(
    round(prop.table(table(tmp$value,tmp$q1_exp3),1),2))[,3:4])
colnames(tmp2) <- c("Tax Lever","Neutral","Don't know")

tmp <- dplyr::left_join(tmp1,tmp1x)
tmp <- dplyr::left_join(tmp, tmp2)
kable(tmp, booktabs =T,linesep="") %>% 
  kable_styling(latex_options = c("striped","HOLD_position")) %>%
  add_header_above(c("", "Median response time (seconds)"=2,"Share of ..."=2)) %>%
  column_spec(3,border_right = T)


```


### Relationship between Don't Knows and Neutral Responses by Tax Lever

```{r dkNeutralLever,fig.pos="t",fig.width=6,fig.height=6,fig.cap="Share of neutral and don't know responses by tax lever.\\label{dkNeutralLever}"}
tmp <- survey[,c("page_p_exp3_timing","q1_exp3","a_lever","b_lever")]
tmp <- tidyr::pivot_longer(tmp,cols = 3:4)

# Response time (in seconds) by tax lever
tmp1 <- aggregate(page_p_exp3_timing ~ value, data=tmp, function(x) round(median(x),2)) 
colnames(tmp1) <- c("Tax Lever","Median Response Time (in seconds)")
rownames(tmp1) <-NULL


# Share of Neutrals and Don't Know's by tax lever
tmp2 <- tibble::rownames_to_column(
  as.data.frame.matrix(
    round(prop.table(table(tmp$value,tmp$q1_exp3),1),2))[,3:4])
colnames(tmp2) <- c("Tax Lever","Neutral","Don't know")

tmp <- dplyr::left_join(tmp1,tmp2)
tmp$pl = c("A&T", "CGT", "Council", "Corp.", "Fuel", "IHT", "IHT \nthreshold", "NI: \nmain", "NI: \nhigh wage", "NI: \nemployers", "NI: \nself-emp", "NI: \nlower \nthreshold", "NI: \nupper \nthreshold", "NI: \nemployer \nthreshold", "NI: \nself-emp \nlower thresh.", "PIT: \nadditional", "PIT: \nbasic", "PIT: \nhigher", "PIT: higher \nthreshold", "PIT: lower \nthreshold", "SDLT", "SDLT \nthreshold", "VAT")
library(ggthemes)
ggplot(tmp, mapping = aes(x=Neutral, y=`Don't know`,label = pl))+
  geom_point()+ #geom_text(hjust=0.02, vjust=0.02, size = 2)
  geom_smooth(method='lm')+
  geom_label_repel(box.padding   = 0.35, 
                   point.padding = 0.5,
                   segment.color = 'grey50') +
  scale_y_continuous(labels = scales::label_percent(accuracy=0.1)) +
  scale_x_continuous(labels = scales::label_percent(accuracy=0.1)) +
  theme_clean() +
  theme(plot.background = element_rect(colour=NA),
        panel.grid.major.x = element_line(color = "lightgray",linetype = "dotted"))


#rtype <- as.character(survey$q1_exp3)
#rtype[which(survey$q1_exp3 == "Option A" | survey$q1_exp3 == "Option B")] <- "Substantive"
#rtype[which(survey$q1_exp3 == "I think both of these changes are equally good or bad")] <- "Neutral"
#rtype <- as.factor(rtype)
#rtype <- relevel(rtype, ref = "Substantive")

#xtmp = tibble(rtype,
#          is.element(survey$profile_education_level,c( "University or CNAA first degree (e.g. BA, B.Sc, B.Ed)","University or CNAA higher degree (e.g. M.Sc, Ph.D)","Other technical, professional or higher qualification")),
#          as.numeric(levels(survey$age))[survey$age],
#          survey$profile_gender
#) 
#names(xtmp)<- c("Response", "Degree", "Age", "Female")

#rtype.demog <- (multinom(Response ~ Degree + Female + Age, data = xtmp))
#plot_model(rtype.demog, transform = NULL)

```

\pagebreak



## Model Specification, Identification, and Estimation {#modelspec}


### Specification

Each respondent $i$ makes a choice between two alternative proposals $j \in \{A, B\}$, with an option to give a neutral response if they are not sure or view both alternatives as equally attractive/unattractive.

- $Y_i = 1$ if Respondent prefers A
- $Y_i = 0.5$ if Respondent gives neutral response
- $Y_i = 0$ if Respondent prefers B
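
The coding above can be sketched in a few lines of R. This is an illustration only, not the replication code: the function name `code_response` is hypothetical, and treating "Don't know" answers as missing is an assumption made here for the example rather than a description of how they are handled in the analysis.

```r
# Hypothetical helper: map survey answers to the numeric outcome Y.
# Assumption for illustration: "Don't know" (or anything else) becomes NA.
code_response <- function(answer) {
  neutral <- "I think both of these changes are equally good or bad"
  y <- rep(NA_real_, length(answer))
  y[which(answer == "Option A")] <- 1
  y[which(answer == neutral)]    <- 0.5
  y[which(answer == "Option B")] <- 0
  y
}

answers <- c("Option A", "Option A", "Option B",
             "I think both of these changes are equally good or bad",
             "Don't know")
y <- code_response(answers)
y                      # 1.0 1.0 0.0 0.5 NA
mean(y, na.rm = TRUE)  # 0.625
```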

Following a generalized Bradley-Terry model framework, we model the expected value of $Y_i$ as a function of the competing "popularities" $\pi_{j}$ of different tax change proposals $j$.  With proposals $A$ and $B$, this can be written formally as:  $$E\left[Y_i\right] = \alpha + \pi_{iA} - \pi_{iB}$$ 
where $\alpha$ is the expected value of $Y_i$ when the two proposals are equally popular, i.e. if $\pi_{iA} = \pi_{iB}$.^[$\alpha$ can be thought of as the order effect 'advantage' of a proposal being presented as option $A$ vs option $B$, irrespective of their content.  If $\alpha=0.5$, there is no advantage.]

Within this framework, we can specify the popularities $\pi_{ij}$ as a function $f(X_i,Z_j)$ of the experimentally varied features of the proposals $Z_j$, and observational characteristics of the respondents $X_i$. This yields a probability-scale model where additive forms of $f(X_i,Z_j)$ can be interpreted as the additive effects on the net support for a proposal with a given feature versus an alternative feature, or for one group of respondents relative to another group, averaging over the opposing proposals. Because a neutral response is coded as $0.5$, $E\left[Y_i\right]$ equals one half plus half of the net share preferring $A$, so, absent an order effect ($\alpha = 0.5$), the difference between $\pi_{iA}$ and $\pi_{iB}$ is half the predicted difference between the proportion of respondents preferring $A$ over $B$ and the share of those preferring $B$ over $A$.[^ldvalternative]  

[^ldvalternative]: Because the modelled probabilities are not close to 0 or 1 for any $A$ or $B$, the results are not sensitive to this choice of a linear functional form.  Similar results can be obtained using an ordered logistic/probit framework with equivalent specifications of the deterministic component.

Many of our models additionally involve a variable $S_i$ which describes the sign of the proposed tax change:

- $S_i = 1$ if prompt describes a choice between tax increases
- $S_i = -1$ if prompt describes a choice between tax cuts

Models that incorporate $S_i$ in different ways enable us to either (a) combine responses from choices over increases and choices over cuts to estimate which tax levers the respondent would generally prefer to use to raise marginal revenue or (b) to disaggregate responses from choices over increases and choices over cuts to consider possible patterns of asymmetry in how respondents would prefer to raise marginal revenue.

Our initial analysis defines $\pi_{ij} = S_i \nu_j$ where $S_i = 1$ for tax increase prompts and $-1$ for decrease prompts, pooling our data such that greater values of $\nu_j$ correspond to taxes $j$ that tend to be preferred as a source of revenue.  Figure \ref{preferencesymmetric} plots the $\nu_j$ parameter for each tax lever $j$, estimated using the model equation: $$E\left[Y_i\right] = \alpha + S_i \nu_A - S_i \nu_B$$ under the identification assumption that $\nu_j \sim N(0,\sigma)$, where $\sigma$ is the estimated standard deviation of the lever popularities around their mean.
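
The pooled model can be illustrated on simulated data. The sketch below is not the estimation code used for the paper (that is the Stan model in the replication package): it swaps the normal prior on $\nu_j$ for a ridge penalty in a least-squares fit, and every name and value in it (`nu_true`, `lambda`, the 20% neutral share, and so on) is hypothetical.

```r
set.seed(1)
J <- 5                                    # number of hypothetical tax levers
nu_true <- c(-0.2, -0.1, 0, 0.1, 0.2)     # made-up popularities, centred at zero
alpha_true <- 0.5                         # no order effect
n <- 20000                                # number of simulated comparisons

# Each comparison pairs two distinct levers and is either an increase (+1) or cut (-1) prompt
a <- sample(J, n, replace = TRUE)
b <- (a + sample(J - 1, n, replace = TRUE) - 1) %% J + 1
s <- sample(c(-1, 1), n, replace = TRUE)

# Expected response under the pooled model: alpha + S * (nu_A - nu_B)
mu <- alpha_true + s * (nu_true[a] - nu_true[b])

# Draw responses in {0, 0.5, 1} with a fixed neutral share so that E[Y] = mu
p_neutral <- 0.2
p_a <- mu - p_neutral / 2
u <- runif(n)
y <- ifelse(u < p_a, 1, ifelse(u < p_a + p_neutral, 0.5, 0))

# Design matrix: +S in the column of lever A, -S in the column of lever B
Z <- matrix(0, n, J)
Z[cbind(seq_len(n), a)] <- s
Z[cbind(seq_len(n), b)] <- -s
X <- cbind(1, Z)

# Ridge-penalised least squares; the penalty (not applied to alpha) plays the
# role of the zero-centred prior and pins down the otherwise free location of nu
lambda <- 1
penalty <- diag(c(0, rep(lambda, J)))
coef_hat <- solve(crossprod(X) + penalty, crossprod(X, y))
alpha_hat <- coef_hat[1]
nu_hat <- coef_hat[-1]

round(cbind(nu_true, nu_hat), 2)
```

Because every row of `Z` contains one `+S` and one `-S`, the unpenalised problem cannot distinguish $\nu_j$ from $\nu_j + c$; the penalty selects the solution whose estimates sum to zero.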

Figures \ref{preferencebyIncome} and \ref{preferencebyvote19} plot the popularities $\beta_j X_i$ of each tax lever $j$ for different groups of respondents, estimated using the model equation: $$E\left[Y_i\right] = \alpha + S_i \left(\beta_A X_i\right) - S_i \left(\beta_B X_i\right)$$ where we estimate a vector of coefficients $\beta_j$ per tax lever and define $X_i$ as the row of a design matrix that has an intercept (column of ones) plus some number of features $k$ of the respondent giving response $i$. We regularize the coefficients with a normal prior $\beta_{jk} \sim N\left(0,\sigma_k\right)$ that shrinks all tax-specific coefficients towards zero according to their common variance by feature $k$.  This avoids spuriously large differences due to limited samples and the number of comparisons being considered.  
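
How group-specific estimates and the "difference excludes zero" flags follow from these coefficients can be sketched with simulated draws. This is a hedged illustration, assuming a single binary respondent feature so that each row of the design matrix is $(1, x_i)$; the array `beta` below is filled with made-up numbers and merely stands in for posterior draws of $\beta_{jk}$ indexed as draw, lever, coefficient.

```r
set.seed(2)
n_draws <- 4000
n_levers <- 3

# Made-up posterior means: column 1 = reference group, column 2 = group contrast
beta_mean <- rbind(c( 0.10,  0.00),
                   c(-0.05,  0.08),
                   c( 0.00, -0.01))

# Stand-in for posterior draws: beta[draw, lever, coefficient]
beta <- array(NA_real_, dim = c(n_draws, n_levers, 2))
for (j in seq_len(n_levers)) {
  for (k in 1:2) {
    beta[, j, k] <- rnorm(n_draws, mean = beta_mean[j, k], sd = 0.02)
  }
}

# Popularity for the reference group (x = 0) and the comparison group (x = 1)
ref_draws <- beta[, , 1]
grp_draws <- beta[, , 1] + beta[, , 2]

ref_est <- colMeans(ref_draws)
grp_est <- colMeans(grp_draws)
ref_int <- apply(ref_draws, 2, quantile, c(0.025, 0.975))
grp_int <- apply(grp_draws, 2, quantile, c(0.025, 0.975))

# A group difference is flagged when the 95% interval of the contrast excludes zero
diff_int <- apply(beta[, , 2], 2, quantile, c(0.025, 0.975))
diff_sig <- diff_int[1, ] > 0 | diff_int[2, ] < 0
diff_sig
```

Summing the draws before summarising, rather than adding the two point estimates and their intervals, keeps the interval for the comparison group consistent with the joint uncertainty in both coefficients.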

We use this same model setup for the analyses presented in appendix figures.  In the figure comparing preferences in tax increase versus tax decrease prompts, we use $S_i$ as our $X_i$ variable, which creates an interaction between levers and the tax change direction, yielding separate estimates for each tax change direction for each lever.
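
A short worked step shows why this yields direction-specific estimates. As a sketch, assume the respondent-level row is $(1, S_i)$ with coefficients $\beta_{j1}$ (intercept) and $\beta_{j2}$ (direction); these subscripts are introduced here for illustration. Since $S_i^2 = 1$, the contribution of lever $j$ to the model equation is

$$S_i\left(\beta_{j1} + \beta_{j2} S_i\right) = S_i \beta_{j1} + \beta_{j2}$$

which equals $\beta_{j1} + \beta_{j2}$ in increase prompts ($S_i = 1$) and $-\left(\beta_{j1} - \beta_{j2}\right)$ in cut prompts ($S_i = -1$). On the pooled scale, where larger values mean a lever is preferred as a source of revenue, the popularity of lever $j$ is therefore $\beta_{j1} + \beta_{j2}$ in increase prompts and $\beta_{j1} - \beta_{j2}$ in cut prompts.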

### Identification

By assuming that $\nu_j \sim N(0,\sigma)$, we set the zero point for our interval-level quantity of interest as the average of the popularities for the tax levers we tested.  As noted in the main text, this kind of experimental design cannot yield estimates of absolute popularity of tax levers.  Our identification restriction here is analogous to the one used in "random effects" models, as opposed to the "fixed effects" restriction of setting a single level to zero and estimating all others relative to that one.  Thus, the interval estimates in our figures should be understood as describing uncertainty about a given lever relative to the average level, which is presented as a dotted vertical line in each plot.
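
The need for a restriction of this kind can be seen in one line, using the pooled model equation with an arbitrary constant $c$ introduced here for illustration. Adding $c$ to every lever popularity leaves the model's predictions unchanged:

$$\alpha + S_i\left(\nu_A + c\right) - S_i\left(\nu_B + c\right) = \alpha + S_i \nu_A - S_i \nu_B$$

The data are therefore informative only about differences between levers, and centring the $\nu_j$ at zero fixes the otherwise arbitrary $c$.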

### Estimation

We estimate our models using Stan [@stan], with full code available in our replication package.

\pagebreak


## Robustness Checks {#robustnessfigs}


### Preferences over Tax Increases Versus Decreases {#cutsvsincreases}

```{r}

load(file="posterior_asymmetric.Rdata")


alpha_est <- mean(extract(posterior_asymmetric)$alpha)
alpha_int <- quantile(extract(posterior_asymmetric)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_asymmetric)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_asymmetric)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_asymmetric)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_asymmetric)$beta,c(2,3),quantile,c(0.025,0.975))
beta_2_sig <- rowSums(sign(t(beta_int[,,2]))) != 0

beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 increase=round(beta_est[,2],2)
)
    
```


```{r, fig.pos="t", fig.width=6,fig.height=6,fig.cap="Relative popularity of a given tax in tax increase prompts as a function of the relative popularity of the same tax in tax cut prompts. Text labels provided for tax levers where 95% intervals for the differences exclude zero."}

plot(beta_est[,1]-beta_est[,2],beta_est[,1]+beta_est[,2],
     xlab="Relative Popularity of Tax in Cut Prompt",ylab="Relative Popularity of Tax in Increase Prompt",
     xlim=c(-0.25,0.25),ylim=c(-0.25,0.25),pch=16)
abline(h=0,col=rgb(0,0,0,0.5))
abline(v=0,col=rgb(0,0,0,0.5))
abline(a=0,b=1,col=rgb(0,0,0,0.5))
 text(beta_est[beta_2_sig,1]-beta_est[beta_2_sig,2],
      beta_est[beta_2_sig,1]+beta_est[beta_2_sig,2],
      long_labels[beta_2_sig],
      pos=2 + 2*(beta_est[beta_2_sig,2] < 0),
     cex=0.75)

#for (k in 1:stan_data_symmetric$N_levers) {
#  lines(rep(nu_est[1,k],2),nu_int[,2,k],col=rgb(0,0,0,0.5))
#  lines(nu_int[,1,k],rep(nu_est[2,k],2),col=rgb(0,0,0,0.5))
#}

```

\pagebreak

### Preferences for Larger versus Smaller Tax Changes {#biglittle}


```{r}
load(file="posterior_biglittle.Rdata")


alpha_est <- mean(extract(posterior_biglittle)$alpha)
alpha_int <- quantile(extract(posterior_biglittle)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_biglittle)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_biglittle)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_biglittle)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_biglittle)$beta,c(2,3),quantile,c(0.025,0.975))
```


```{r,fig.pos="t",fig.width=6,fig.height=6,fig.cap="Relative popularity of changing a given tax lever in a given direction, to change revenue by £1 billion (x-axis) versus £10 billion (y-axis). There are no tax levers where 95% intervals for the differences exclude zero."}

plot(c(beta_est[,1],beta_est[,3]),c(beta_est[,1],beta_est[,3])+c(beta_est[,2],beta_est[,4]),
     xlab="Relative Popularity of Tax - £1 billion",ylab="Relative Popularity of Tax - £10 billion",
     xlim=c(-0.25,0.25),ylim=c(-0.25,0.25),pch=16)
abline(h=0,col=rgb(0,0,0,0.5))
abline(v=0,col=rgb(0,0,0,0.5))
abline(a=0,b=1,col=rgb(0,0,0,0.5))

```

\pagebreak


### Sensitivity to Arguments {#argumenteffects}


```{r}

load(file="posterior_procon.Rdata")

alpha_est <- mean(extract(posterior_procon)$alpha)
alpha_int <- quantile(extract(posterior_procon)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_procon)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_procon)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_procon)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_procon)$beta,c(2,3),quantile,c(0.025,0.975))
beta_2_sig_pro <- c(rowSums(sign(t(beta_int[,,2]))) != 0, rowSums(sign(t(beta_int[,,5]))) != 0)
beta_2_sig_con <- c(rowSums(sign(t(beta_int[,,3]))) != 0, rowSums(sign(t(beta_int[,,6]))) != 0)
    
```



```{r,fig.pos="t",fig.width=6,fig.height=6,fig.cap="Relative popularity of changing a given tax lever in a given direction, in the baseline condition (x-axis) versus with pro or con argument texts provided (y-axis). Text labels provided for tax levers where 95% intervals for the differences exclude zero."}

plot(c(beta_est[,1],beta_est[,4]),c(beta_est[,1],beta_est[,4])+c(beta_est[,2],beta_est[,5]),
     xlab="Relative Popularity of Tax - No Argument",ylab="Relative Popularity of Tax - With Argument",
     xlim=c(-0.25,0.25),ylim=c(-0.25,0.25),pch=16)
points(c(beta_est[,1],beta_est[,4]),c(beta_est[,1],beta_est[,4])+c(beta_est[,3],beta_est[,6]),pch=16)
abline(h=0,col=rgb(0,0,0,0.5))
abline(v=0,col=rgb(0,0,0,0.5))
abline(a=0,b=1,col=rgb(0,0,0,0.5))
 text(c(beta_est[,1],beta_est[,4])[beta_2_sig_pro],
      c(c(beta_est[,1],beta_est[,4])+c(beta_est[,2],beta_est[,5]))[beta_2_sig_pro],
      paste0(long_labels[beta_2_sig_pro],"\n(pro-argument; ",rep(c("cut","increase"),each=13)[beta_2_sig_pro],")"),
      pos=2 + 2*(c(beta_est[,2],beta_est[,5])[beta_2_sig_pro] < 0),
     cex=0.75)

```


\pagebreak


## Estimated Preference by Covariates {#othercovariates}

In this appendix, we report estimates examining tax lever preferences by EU referendum vote, 2019 general election turnout, gender, income and degree status.

### Preferences by EU Referendum Vote

```{r}

load(file="posterior_leaveremain.Rdata")

alpha_est <- mean(extract(posterior_leaveremain)$alpha)
alpha_int <- quantile(extract(posterior_leaveremain)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_leaveremain)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_leaveremain)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_leaveremain)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_leaveremain)$beta,c(2,3),quantile,c(0.025,0.975))

remain_chains <- extract(posterior_leaveremain)$beta[,,1]
leave_chains <- extract(posterior_leaveremain)$beta[,,1] + extract(posterior_leaveremain)$beta[,,2]

remain_est <- apply(remain_chains,2,mean)
leave_est <- apply(leave_chains,2,mean)
remain_int <- apply(remain_chains,2,quantile,c(0.025,0.975))
leave_int <- apply(leave_chains,2,quantile,c(0.025,0.975))
remain_leave_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0

beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 leave=round(beta_est[,2],2)
)
    
```


```{r}

# kable(beta_est_table_pretty,booktabs =T,linesep="")

```

```{r preferencebyEUvote,fig.pos="t",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers for Leave (blue squares) versus Remain (yellow circles) voters in the 2016 EU Referendum, in units of probability of supporting taxation via a given lever versus others in pairwise comparisons of revenue-equivalent increases and decreases.  Solid points and black label text indicate tax levers where the 95% interval for the difference excludes zero. \\label{preferenceEUref}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_leaveremain$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + 0.5*beta_est[,2])
points(beta_est[sortorder,1],1:stan_data_leaveremain$N_levers-0.1,pch=1 + 15*remain_leave_diff_sig[sortorder],col=remain_col)
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_leaveremain$N_levers+0.1,pch=15*remain_leave_diff_sig[sortorder],col=leave_col)
for (k in 1:stan_data_leaveremain$N_levers) {
  lines(leave_int[,sortorder[k]],c(k,k)+0.1,col=leave_col)
  lines(remain_int[,sortorder[k]],c(k,k)-0.1,col=remain_col)
  # Label to the left of the intervals for levers above average, to the right otherwise;
  # black text where the Leave-Remain difference excludes zero, faded text elsewhere
  label_col <- rgb(0,0,0,0.25+0.75*remain_leave_diff_sig[sortorder[k]])
  if (leave_est[sortorder[k]] + remain_est[sortorder[k]] > 0) {
    text(min(leave_int[1,sortorder[k]],remain_int[1,sortorder[k]]),k,long_labels[sortorder[k]],pos=2,cex=0.75,col=label_col)
  } else {
    text(max(leave_int[2,sortorder[k]],remain_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,col=label_col)
  }
}

legend("topleft",legend = c("Leave","Remain"),
       col = c(leave_col,remain_col),
       pch = c(15,16),
       title = "EU Referendum vote",
       cex = 0.75, box.lty=0,title.adj = 0)

```



### Preferences by 2019 Voter Turnout


```{r}

load(file="posterior_turnout.Rdata")


alpha_est <- mean(extract(posterior_turnout)$alpha)
alpha_int <- quantile(extract(posterior_turnout)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_turnout)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_turnout)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_turnout)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_turnout)$beta,c(2,3),quantile,c(0.025,0.975))

abstained_chains <- extract(posterior_turnout)$beta[,,1]
voted_chains <- extract(posterior_turnout)$beta[,,1] + extract(posterior_turnout)$beta[,,2]

abstained_est <- apply(abstained_chains,2,mean)
voted_est <- apply(voted_chains,2,mean)
abstained_int <- apply(abstained_chains,2,quantile,c(0.025,0.975))
voted_int <- apply(voted_chains,2,quantile,c(0.025,0.975))
abs_vote_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0


beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 voted=round(beta_est[,2],2)
)
    
```


```{r}

# kable(beta_est_table_pretty,booktabs =T,linesep="")

```



```{r preferencebyTurnout,fig.pos="t",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers for 2019 non-voters (grey circles) versus 2019 voters (blue squares), in units of probability of supporting taxation via a given lever versus others in pairwise comparisons of revenue-equivalent increases and decreases. Solid points and black label text indicate tax levers where the 95% interval for the difference excludes zero. \\label{preferenceTurnout}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_turnout$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + 0.5*beta_est[,2])
points(beta_est[sortorder,1],1:stan_data_turnout$N_levers-0.1,pch=1 + 15*abs_vote_diff_sig[sortorder],col="gray")
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_turnout$N_levers+0.1,pch=15*abs_vote_diff_sig[sortorder],col="blue")
for (k in 1:stan_data_turnout$N_levers) {
  lines(abstained_int[,sortorder[k]],c(k,k)-0.1,col="gray")
  lines(voted_int[,sortorder[k]],c(k,k)+0.1,col="blue")
  # Label to the left of the intervals for levers above average, to the right otherwise;
  # black text where the voter/non-voter difference excludes zero, faded text elsewhere
  label_col <- rgb(0,0,0,0.25+0.75*abs_vote_diff_sig[sortorder[k]])
  if (abstained_est[sortorder[k]] + voted_est[sortorder[k]] > 0) {
    text(min(abstained_int[1,sortorder[k]],voted_int[1,sortorder[k]]),k,long_labels[sortorder[k]],pos=2,cex=0.75,col=label_col)
  } else {
    text(max(abstained_int[2,sortorder[k]],voted_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,col=label_col)
  }
}
legend("topleft",legend = c("Did not vote","Voted"),
       col = c("gray","blue"),
       pch = c(16,15),  # non-voters are plotted as circles, voters as squares
       title = "2019 GE turnout",
       cex = 0.75,box.lty=0)


```


### Preferences by Party Choice (additional categories)


```{r}

load(file="posterior_vote19.Rdata")


alpha_est <- mean(extract(posterior_vote19)$alpha)
alpha_int <- quantile(extract(posterior_vote19)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_vote19)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_vote19)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_vote19)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_vote19)$beta,c(2,3),quantile,c(0.025,0.975))

con_chains <- extract(posterior_vote19)$beta[,,1]
lab_chains <- extract(posterior_vote19)$beta[,,1] + extract(posterior_vote19)$beta[,,2]
ld_chains <- extract(posterior_vote19)$beta[,,1] + extract(posterior_vote19)$beta[,,3]
other_chains <- extract(posterior_vote19)$beta[,,1] + extract(posterior_vote19)$beta[,,4]
none_chains <- extract(posterior_vote19)$beta[,,1] + extract(posterior_vote19)$beta[,,5]

con_est <- apply(con_chains,2,mean)
lab_est <- apply(lab_chains,2,mean)
ld_est <- apply(ld_chains,2,mean)
other_est <- apply(other_chains,2,mean)
none_est <- apply(none_chains,2,mean)

con_int <- apply(con_chains,2,quantile,c(0.025,0.975))
lab_int <- apply(lab_chains,2,quantile,c(0.025,0.975))
ld_int <- apply(ld_chains,2,quantile,c(0.025,0.975))
other_int <- apply(other_chains,2,quantile,c(0.025,0.975))
none_int <- apply(none_chains,2,quantile,c(0.025,0.975))

con_lab_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0
con_ld_diff_sig <- rowSums(sign(t(beta_int[,,3]))) != 0
con_other_diff_sig <- rowSums(sign(t(beta_int[,,4]))) != 0
con_none_diff_sig <- rowSums(sign(t(beta_int[,,5]))) != 0

beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 lab=round(beta_est[,2],2),
 ld=round(beta_est[,3],2),
 other=round(beta_est[,4],2),
 none=round(beta_est[,5],2)
)
    
```


```{r}

#kable(beta_est_table_pretty,booktabs =T,linesep="")

```


```{r preferencebyvote19full,fig.pos="t",fig.width=8,fig.height=8,fig.cap="Relative public preference for tax levers for Conservative (blue squares), Labour (red circles), Liberal Democrat (yellow triangles) voters, voters of other parties (dark gray diamonds) and non-voters (light gray inverted triangles) in the 2019 General Election, in units of probability of supporting taxation via a given lever versus others in pairwise comparisons of revenue-equivalent increases and decreases.  Solid points and black label text indicate tax levers where the 95% interval for the party difference excludes zero. \\label{preferencesbyvote2019full}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,(stan_data_vote19$N_levers+1)*1.4),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + rowSums(0.25*beta_est[,2:5]))
points(beta_est[sortorder,1],1:stan_data_vote19$N_levers*1.4,pch=15*(con_lab_diff_sig[sortorder]|con_ld_diff_sig[sortorder]),col=con_col)
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_vote19$N_levers*1.4+0.4,pch=1+15*con_lab_diff_sig[sortorder],col=lab_col)
points(beta_est[sortorder,1]+beta_est[sortorder,3],1:stan_data_vote19$N_levers*1.4+0.2,pch=2+15*con_ld_diff_sig[sortorder],col=ld_col)
points(beta_est[sortorder,1]+beta_est[sortorder,4],1:stan_data_vote19$N_levers*1.4-0.2,pch=5+13*con_other_diff_sig[sortorder],col="darkgray")
points(beta_est[sortorder,1]+beta_est[sortorder,5],1:stan_data_vote19$N_levers*1.4-0.4,pch=6+19*con_none_diff_sig[sortorder],col="lightgray")
for (k in 1:stan_data_vote19$N_levers) {
    lines(con_int[,sortorder[k]],c(k,k)*1.4,col=con_col)
  lines(lab_int[,sortorder[k]],c(k,k)*1.4+0.4,col=lab_col)
  lines(ld_int[,sortorder[k]],c(k,k)*1.4+0.2,col=ld_col)
  lines(other_int[,sortorder[k]],c(k,k)*1.4-0.2,col="darkgray")
  lines(none_int[,sortorder[k]],c(k,k)*1.4-0.4,col="lightgray")
 if (lab_est[sortorder[k]] + con_est[sortorder[k]] + ld_est[sortorder[k]] > 0) text(min(lab_int[1,sortorder[k]],con_int[1,sortorder[k]],ld_int[1,sortorder[k]], other_int[1,sortorder[k]], none_int[1,sortorder[k]] ),k*1.4,long_labels[sortorder[k]],pos=2,cex=0.75,col=rgb(0,0,0,0.25+0.75*(con_lab_diff_sig[sortorder[k]]|con_ld_diff_sig[sortorder[k]]))) else  text(max(lab_int[2,sortorder[k]],con_int[2,sortorder[k]],ld_int[2,sortorder[k]], other_int[2,sortorder[k]], none_int[2,sortorder[k]] ),k*1.4,long_labels[sortorder[k]],pos=4,cex=0.75,col=rgb(0,0,0,0.25+0.75*(con_lab_diff_sig[sortorder[k]]|con_ld_diff_sig[sortorder[k]]))) 
}
legend("topleft",legend = c("Conservative","Labour","Liberal Democrat","Other parties","Non-voters"),
       col = c(con_col,lab_col,ld_col,"darkgray","lightgray"),
       pch = c(15,16,17,18,25),
       pt.bg = "lightgray",  # fill for pch 25; ignored by pch 15-18
       title = "2019 GE vote",
       cex = 0.75, box.lty=0, title.adj = 0)
```















### Preferences by Income (additional categories)


```{r}

load(file="posterior_income2.Rdata")


alpha_est <- mean(extract(posterior_income2)$alpha)
alpha_int <- quantile(extract(posterior_income2)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_income2)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_income2)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_income2)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_income2)$beta,c(2,3),quantile,c(0.025,0.975))

# Posterior draws by income group: the 25k-60k group is the baseline (intercept);
# the other groups add their offset to the intercept
under25k_chains <- extract(posterior_income2)$beta[,,1] + extract(posterior_income2)$beta[,,2]
under60kover25k_chains <-  extract(posterior_income2)$beta[,,1]
over60k_chains <- extract(posterior_income2)$beta[,,1] + extract(posterior_income2)$beta[,,3]
refused_chains <- extract(posterior_income2)$beta[,,1] + extract(posterior_income2)$beta[,,4]

under25k_est <- apply(under25k_chains,2,mean)
under60kover25k_est <- apply(under60kover25k_chains,2,mean)
over60k_est <- apply(over60k_chains,2,mean)
refused_est <- apply(refused_chains,2,mean)

under25k_int <- apply(under25k_chains,2,quantile,c(0.025,0.975))
under60kover25k_int <- apply(under60kover25k_chains,2,quantile,c(0.025,0.975))
over60k_int <- apply(over60k_chains,2,quantile,c(0.025,0.975))
refused_int <- apply(refused_chains,2,quantile,c(0.025,0.975))


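# Flag levers whose 95% interval for a group difference excludes zero:
# the signs of the two interval bounds sum to zero only when the interval straddles zero
under25k_under60kover25k_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0
over60k_under60kover25k_diff_sig <- rowSums(sign(t(beta_int[,,3]))) != 0
under25k_over60k_diff_sig <- rowSums(sign(t(apply(over60k_chains - under25k_chains,c(2),quantile,c(0.025,0.975))))) != 0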


beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 under25k=round(beta_est[,2],2),
 over60k=round(beta_est[,3],2),
 refused=round(beta_est[,4],2)
)
    
```


```{r preferencebyIncome2,fig.pos="b",fig.width=8,fig.height=8,fig.cap="Relative public preference for tax levers for respondents with household incomes above 60k (blue triangles), between 25k and 60k (purple circles), below 25k (red squares), and those who did not answer the income item (grey diamonds), in units of probability of supporting taxation via a given lever versus others.  Solid points and black label text indicate tax levers where the 95% interval for an income category difference excludes zero.  \\label{preferenceIncome2}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_income2$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1])

  points(under25k_est[sortorder],1:stan_data_income2$N_levers-0.3,col="red",pch=15*(under25k_under60kover25k_diff_sig[sortorder]|under25k_over60k_diff_sig[sortorder]))
  points(under60kover25k_est[sortorder],1:stan_data_income2$N_levers-0.1,col="purple",pch=1+15*(under25k_under60kover25k_diff_sig[sortorder]|over60k_under60kover25k_diff_sig[sortorder]))
  points(over60k_est[sortorder],1:stan_data_income2$N_levers+0.1,col="blue",pch=2+15*(over60k_under60kover25k_diff_sig[sortorder]|under25k_over60k_diff_sig[sortorder]))
  points(refused_est[sortorder],1:stan_data_income2$N_levers+0.3,col="grey",pch=5)
  
for (k in 1:stan_data_income2$N_levers) {
  lines(under25k_int[,sortorder[k]],c(k,k)-0.3,col="red")
  lines(under60kover25k_int[,sortorder[k]],c(k,k)-0.1,col="purple")
  lines(over60k_int[,sortorder[k]],c(k,k)+0.1,col="blue")
  lines(refused_int[,sortorder[k]],c(k,k)+0.3,col="grey")
  
  
  
   if (under25k_est[sortorder[k]] + over60k_est[sortorder[k]] > 0) text(min(under25k_int[1,sortorder[k]],under60kover25k_int[1,sortorder[k]],over60k_int[1,sortorder[k]], refused_int[1,sortorder[k]] ),k,long_labels[sortorder[k]],pos=2,cex=0.75,col=rgb(0,0,0,0.25+0.75*(under25k_under60kover25k_diff_sig[sortorder[k]]|under25k_over60k_diff_sig[sortorder[k]]|over60k_under60kover25k_diff_sig[sortorder[k]]))) else  text(max(under25k_int[2,sortorder[k]],under60kover25k_int[2,sortorder[k]],over60k_int[2,sortorder[k]], refused_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,col=rgb(0,0,0,0.25+0.75*(under25k_under60kover25k_diff_sig[sortorder[k]]|under25k_over60k_diff_sig[sortorder[k]]|over60k_under60kover25k_diff_sig[sortorder[k]]))) 
  
}


legend("topleft",legend = c("below 25k","between 25k and 60k","above 60k","not answered"),
       col = c("red","purple","blue","gray"),
       pch = c(15,16,17,18),
       title = "Household income",
       cex = 0.75, box.lty=0,title.adj = 0)

```




### Preferences by Gender


```{r}

load(file="posterior_gender.Rdata")


alpha_est <- mean(extract(posterior_gender)$alpha)
alpha_int <- quantile(extract(posterior_gender)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_gender)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_gender)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_gender)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_gender)$beta,c(2,3),quantile,c(0.025,0.975))

male_chains <- extract(posterior_gender)$beta[,,1]
female_chains <- extract(posterior_gender)$beta[,,1] + extract(posterior_gender)$beta[,,2]

male_est <- apply(male_chains,2,mean)
female_est <- apply(female_chains,2,mean)
male_int <- apply(male_chains,2,quantile,c(0.025,0.975))
female_int <- apply(female_chains,2,quantile,c(0.025,0.975))
male_female_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0


beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 female=round(beta_est[,2],2)
)
    
```


```{r}

# kable(beta_est_table_pretty,booktabs =T,linesep="")

```

```{r preferencebyGender,fig.pos="t",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers for men (pink circles) versus women (blue squares), in units of probability of supporting taxation via a given lever versus others in pairwise comparisons of revenue-equivalent increases and decreases.  Solid points and black label text indicate tax levers where the 95% interval for the gender difference excludes zero. \\label{preferenceGender}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_gender$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + 0.5*beta_est[,2])
points(beta_est[sortorder,1],1:stan_data_gender$N_levers+0.1,
       pch=1 + 15*male_female_diff_sig[sortorder],col="#E26A89")
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_gender$N_levers-0.1,
       pch=15*male_female_diff_sig[sortorder],col="#005BB5")
for (k in 1:stan_data_gender$N_levers) {
  lines(male_int[,sortorder[k]],c(k,k)+0.1,col="#E26A89")
  lines(female_int[,sortorder[k]],c(k,k)-0.1,col="#005BB5")
 if (male_est[sortorder[k]] + female_est[sortorder[k]] > 0) text(min(male_int[1,sortorder[k]],female_int[1,sortorder[k]]),k,long_labels[sortorder[k]],pos=2,cex=0.75,col=rgb(0,0,0,0.25+0.75*male_female_diff_sig[sortorder[k]])) else  text(max(male_int[2,sortorder[k]],female_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,col=rgb(0,0,0,0.25+0.75*male_female_diff_sig[sortorder[k]])) 
}

legend("topleft",legend = c("Men","Women"),
       col = c("#E26A89","#005BB5"),
       pch = c(16,15),
       cex = 0.75, box.lty=0,title.adj = 0)

```




### Preferences by Education Level


```{r}

load(file="posterior_degree.Rdata")


alpha_est <- mean(extract(posterior_degree)$alpha)
alpha_int <- quantile(extract(posterior_degree)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_degree)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_degree)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_degree)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_degree)$beta,c(2,3),quantile,c(0.025,0.975))

nodegree_chains <- extract(posterior_degree)$beta[,,1]
degree_chains <- extract(posterior_degree)$beta[,,1] + extract(posterior_degree)$beta[,,2]

nodegree_est <- apply(nodegree_chains,2,mean)
degree_est <- apply(degree_chains,2,mean)
nodegree_int <- apply(nodegree_chains,2,quantile,c(0.025,0.975))
degree_int <- apply(degree_chains,2,quantile,c(0.025,0.975))
nodegree_degree_diff_sig <- rowSums(sign(t(beta_int[,,2]))) != 0


beta_est_table_pretty <- data.frame(
 tax=lever_labels,
 intercept=round(beta_est[,1],2),
 degree=round(beta_est[,2],2)
)
    
```


```{r}

# kable(beta_est_table_pretty,booktabs =T,linesep="")

```


```{r preferencebyEducation,fig.pos="t",fig.width=8,fig.height=6,fig.cap="Relative public preference for tax levers for respondents without (blue circles) versus with (purple squares) a university degree, in units of probability of supporting taxation via a given lever versus others in pairwise comparisons of revenue-equivalent increases and decreases.  Solid points and black label text indicate tax levers where the 95% interval for the difference excludes zero. \\label{preferenceEducation}"}

plot(0,0,xlim=c(-0.275,0.275),ylim=c(0.5,stan_data_degree$N_levers+1),
     type="n",axes=FALSE,
     xlab="Relative Popularity of Tax Levers",ylab="")
abline(v = 0,col=rgb(0,0,0,0.5),lty=3)
axis(1)
sortorder <- order(beta_est[,1] + 0.5*beta_est[,2])
points(beta_est[sortorder,1],1:stan_data_degree$N_levers+0.1,
       pch=1 + 15*nodegree_degree_diff_sig[sortorder],col="#66cbff")
points(beta_est[sortorder,1]+beta_est[sortorder,2],1:stan_data_degree$N_levers-0.1,
       pch=15*nodegree_degree_diff_sig[sortorder],col="#612899")
for (k in 1:stan_data_degree$N_levers) {
  lines(nodegree_int[,sortorder[k]],c(k,k)+0.1,col="#66cbff")
  lines(degree_int[,sortorder[k]],c(k,k)-0.1,col="#612899")
 if (nodegree_est[sortorder[k]] + degree_est[sortorder[k]] > 0) text(min(nodegree_int[1,sortorder[k]],degree_int[1,sortorder[k]]),k,long_labels[sortorder[k]],pos=2,cex=0.75,col=rgb(0,0,0,0.25+0.75*nodegree_degree_diff_sig[sortorder[k]])) else  text(max(nodegree_int[2,sortorder[k]],degree_int[2,sortorder[k]]),k,long_labels[sortorder[k]],pos=4,cex=0.75,col=rgb(0,0,0,0.25+0.75*nodegree_degree_diff_sig[sortorder[k]])) 
}

legend("topleft",legend = c("No university degree","University degree"),
       col = c("#66cbff","#612899"),
       pch = c(16,15),
       cex = 0.75, box.lty=0,title.adj = 0)

```


```{=latex}
\begin{landscape}
```

### Preference Multivariate Analysis


```{r}

load(file="posterior_multivariate.Rdata")


alpha_est <- mean(extract(posterior_multivariate)$alpha)
alpha_int <- quantile(extract(posterior_multivariate)$alpha,c(0.025,0.975))

sigma_beta_est <- apply(extract(posterior_multivariate)$sigma_beta,c(2),mean)
sigma_beta_int <- apply(extract(posterior_multivariate)$sigma_beta,c(2),quantile,c(0.025,0.975))

beta_est <- apply(extract(posterior_multivariate)$beta,c(2,3),mean)
beta_int <- apply(extract(posterior_multivariate)$beta,c(2,3),quantile,c(0.025,0.975))

# Comparison to bivariate estimates
load(file="posterior_income.Rdata")
income_est <- apply(extract(posterior_income)$beta,c(2,3),mean)
load(file="posterior_degree.Rdata")
degree_est <- apply(extract(posterior_degree)$beta,c(2,3),mean)
load(file="posterior_gender.Rdata")
gender_est <- apply(extract(posterior_gender)$beta,c(2,3),mean)
load(file="posterior_leaveremain.Rdata")
ref_est <- apply(extract(posterior_leaveremain)$beta,c(2,3),mean)
load(file="posterior_vote19.Rdata")
vote19_est <- apply(extract(posterior_vote19)$beta,c(2,3),mean)


beta_est_table_pretty <- data.frame(
 tax=c(lever_labels,"Correlation with bivariate estimates"),
 intercept=c(round(beta_est[,1],3),NA),
 over45k=c(round(beta_est[,2],3),round(cor(beta_est[,2],income_est[,2]),3)),
 refused=c(round(beta_est[,3],3),round(cor(beta_est[,3],income_est[,3]),3)),
 degree=c(round(beta_est[,4],3),round(cor(beta_est[,4],degree_est[,2]),3)),
 female=c(round(beta_est[,5],3),round(cor(beta_est[,5],gender_est[,2]),3)),
 leave=c(round(beta_est[,6],3),round(cor(beta_est[,6],ref_est[,2]),3)),
 lab=c(round(beta_est[,7],3),round(cor(beta_est[,7],vote19_est[,2]),3)),
 ld=c(round(beta_est[,8],3),round(cor(beta_est[,8],vote19_est[,3]),3)),
 other=c(round(beta_est[,9],3),round(cor(beta_est[,9],vote19_est[,4]),3)),
 none=c(round(beta_est[,10],3),round(cor(beta_est[,10],vote19_est[,5]),3))
)

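# R2: squared correlation, weighted by W8, between posterior-mean predictions and observed choices.
# Column 3 of get_posterior_mean() is assumed here to hold the all-chain mean (the case for a two-chain fit).
survey$predicted <- get_posterior_mean(posterior_multivariate,"mu")[,3]
r_squared <- cov.wt(survey[,c("predicted","choice")],wt=survey$W8,cor = TRUE)$cor[1,2]^2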

```


```{r}
options(knitr.kable.NA = '')
kable(beta_est_table_pretty,booktabs = TRUE,longtable = TRUE,linesep="",
      align = c("l",rep("r",ncol(beta_est)))) %>% 
  kable_styling(latex_options = c("HOLD_position","repeat_header", "striped"),
                    full_width = FALSE) %>%
  # Rule after the last lever row, separating the correlation row; rows are built from lever_labels
  row_spec(length(lever_labels),hline_after = TRUE)

```

```{=latex}
\end{landscape}
```




